Gain of function mutations in ATP-dependent transposition proteins

ABSTRACT

The invention is specifically directed to efficient, random, simple insertion of a transposon or derivative transposable element into DNA in vivo or in vitro. The invention is particularly directed to mutations in ATP-utilizing regulatory transposition proteins that permit insertion with less target-site specificity than wild-type. The invention encompasses gain-of-function mutations in TnsC, an ATP-utilizing regulatory transposition protein that activates the bacterial transposon Tn7. Such mutations enable the insertion of a Tn7 transposon or derivative transposable element in a non-specific manner into a given DNA segment. Insertion can be effected in plasmid and cosmid libraries, cDNA libraries, PCR products, bacterial artificial chromosomes, yeast artificial chromosomes, mammalian artificial chromosomes, genomic DNAs, and the like. Such insertion is useful in DNA sequencing methods, for genetic analysis by insertional mutagenesis, and alteration of gene expression by insertion of a given genetic sequence.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application is a continuation of U.S. application Ser. No. 09/027,169, filed Feb. 20, 1998, which claims priority to provisional application no. 60/037,955 filed on Feb. 20, 1997, the teachings of which are hereby incorporated herein in their entirety by reference.

BACKGROUND OF THE INVENTION

[0002] The invention is specifically directed to efficient, random, simple insertion of a transposon or derivative transposable element into DNA in vivo or in vitro. The invention is particularly directed to mutations in ATP-utilizing regulatory transposition proteins that permit insertion with less target-site specificity than wild-type. The invention encompasses gain-of-function mutations in TnsC, an ATP-utilizing regulatory transposition protein that activates the bacterial transposon Tn7. Such mutations enable the insertion of a Tn7 transposon or derivative transposable element in a non-specific manner into a given DNA segment. Insertion can be effected in plasmid and cosmid libraries, cDNA libraries, PCR products, bacterial artificial chromosomes, yeast artificial chromosomes, mammalian artificial chromosomes, genomic DNAs, and the like. Such insertion is useful in DNA sequencing methods, for genetic analysis by insertional mutagenesis, and alteration of gene expression by insertion of a given genetic sequence.

[0003] Description of the Background Art

[0004] Transposable elements are discrete segments of DNA capable of mobilizing nonhomologously from one genetic location to another. They typically carry sequence information important for two main functions that confer the ability to mobilize: they encode the proteins necessary to carry out the catalytic activity associated with transposition, and they contain the cis-acting sequences, located at the transposon termini, that act as substrates for these proteins. The same proteins can participate in the selection of the target site for insertion.

[0005] The selection of a new insertion site is usually not a random process; instead, many transposons show characteristic preferences for certain types of target sites. One broad characteristic that differentiates the wide variety of transposable elements known is the nature of the target site selectivity (1). A component of this selectivity can be the target sequence itself. The bacterial transposon Tn10 preferentially selects a relatively highly conserved 9 bp motif as the predominant site for transposon insertion and less often selects other more distantly related sites in vivo (2). The Tc1 and Tc3 mariner elements of C. elegans insert preferentially at a TA dinucleotide such that each end of the element is flanked by a TA duplication (3) (4) (5). A lower specificity consensus sequence, N-Y-G/C-R-N, has been determined from populations of both in vivo and in vitro insertions for the bacteriophage Mu (7). In contrast to these elements, the bacterial transposon Tn5 exhibits markedly lower insertion site specificity, although some isolated “hotspots” have been detected (8).
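
The sequence preferences described above can be written as degenerate motifs and located in a candidate target. The following is a minimal illustrative sketch, not part of the original disclosure: it assumes the Mu consensus N-Y-G/C-R-N is written in IUPAC codes as NYSRN (S standing for G or C), scans only the strand supplied, and uses hypothetical function names.

```python
import re

# IUPAC degenerate nucleotide codes needed for the motifs discussed above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]",  # any base
    "Y": "[CT]",    # pyrimidine
    "R": "[AG]",    # purine
    "S": "[CG]",    # G or C
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus into a regex.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    body = "".join(IUPAC[code] for code in consensus.upper())
    return re.compile("(?=(" + body + "))")


def find_sites(seq, consensus):
    """Return 0-based start positions of every match on the given strand."""
    pattern = consensus_to_regex(consensus)
    return [m.start() for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    print(find_sites("ATAGTA", "TA"))    # TA dinucleotides (Tc1/Tc3-like preference)
    print(find_sites("ACGAT", "NYSRN"))  # Mu-like consensus
```

A target with many matching positions would be expected to offer many candidate sites to an element with that preference; an element with little sequence preference would not be constrained to such positions.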

[0006] Another selection mechanism relies on structural features or presence of cellular protein complexes at the target sites. The yeast transposon Ty3 preferentially inserts into the promoters of genes transcribed by RNA polymerase III, responding to signals from cellular proteins TFIIIB and TFIIIC (9).

[0007] Understanding how these factors modulate transposase activity to impose target site preferences will lend insight into the spread of transposons and viruses, and may suggest ways to manipulate those target preferences. The bacterial transposon Tn7 is distinctive in that it uses several element-encoded accessory proteins to evaluate potential target DNAs for positive and negative features, and to select a target site (1). Tn7 encodes five genes whose protein products mediate its transposition (10) (11).

[0008] Two of the proteins, TnsA and TnsB, constitute the transposase activity, collaborating to execute the catalytic steps of strand breakage and joining (12). The activity of this transposase is modulated by the remaining proteins, TnsC, TnsD, and TnsE, and also by the nature of the target DNA.

[0009] TnsC, TnsD, and TnsE interact with the target DNA to modulate the activity of the transposase via two distinct pathways. TnsABC+TnsD directs transposition to attTn7, a discrete site on the E. coli chromosome, at a high frequency, and to other loosely related “pseudo att” sites at low frequency (13). The alternative combination TnsABC+E directs transposition to many unrelated non-attTn7 sites in the chromosome at low frequency (13) (10) (11) and preferentially to conjugating plasmids (14). Thus, attTn7 and conjugable plasmids contain positive signals that recruit the transposon to these target DNAs. The alternative target site selection mechanisms enable Tn7 to inspect a variety of potential target sites in the cell and select those most likely to ensure its survival.

[0010] The Tn7 transposition machinery can also recognize and avoid targets that are unfavorable for insertion. Tn7 transposition occurs only once into a given target molecule; repeated transposition events into the same target are specifically inhibited (15) (16). Therefore, a pre-existing copy of Tn7 in a potential target DNA generates a negative signal which renders that target “immune” to further insertion. The negative target signal affects both TnsD- and TnsE-activated transposition reactions and is dominant to any positive signals present on a potential target molecule (16). Several other transposons, such as Mu and members of the Tn3 family, also display this form of negative target regulation (17) (18) (19) (7).

[0011] Target selection could be an early or late event in the course of a transposition reaction. For example, a transposon could constitutively excise from its donor position, and the excised transposon could then be captured at different frequencies by different types of target molecules. Tn10 appears to follow this course of events in vitro, excising from its donor position before any interactions with target DNA occur (20) (21). Alternatively, the process of transposon excision could itself be dependent on the identification of a favorable target site. Tn7 transposition shows an early dependence on target DNA signals in vitro: neither transposition intermediates nor insertion products are seen in the absence of an attTn7 target (22). Thus, the nature of the target DNA appears to regulate the initiation of Tn7 transposition in vitro.

[0012] An important question is how positive and negative target signals are communicated to the Tn7 transposase. Reconstitution of the TnsABC+TnsD reaction in vitro has provided a useful tool for detailed dissection of Tn7 transposition (22) (23). This reaction has been instrumental in delineating the role that each of the individual proteins plays in target site selection. Dissection of the TnsABC+D reaction in vitro has implicated TnsC as a pivotal connector between the TnsAB transposase and the target DNA. TnsC is an ATP-dependent DNA-binding protein with no known sequence specificity (24). However, TnsC can respond to signals from attTn7 via an interaction with the site-specific DNA-binding protein TnsD. In a standard in vitro transposition reaction, TnsD is required for transposition to the attTn7 site on a target DNA molecule. This site-specific insertion process is tightly regulated by TnsC, but does not occur in the absence of TnsD. Additional evidence for a TnsC-TnsD interaction comes from DNA protection and band shift analysis with attTn7 DNA (23). Direct interaction between TnsC and the TnsAB transposase has also recently been observed (25) (26).

[0013] Therefore, TnsC may serve as a “connector” or “matchmaker” between the transposase and the TnsD+attTn7 target complex (23) (27). This connection is not constitutive, but instead appears to be regulated by the ATP state of TnsC. Only the ATP-bound form of TnsC is competent to interact with target DNAs and activate the TnsA+B transposase; the ADP-bound form of TnsC has neither of these activities and cannot participate in Tn7 transposition (24) (23). TnsC hydrolyzes ATP at a modest rate (25), and therefore can switch from an active to an inactive state. The modulation of the ATP state of TnsC may be a central mechanism for regulating Tn7 transposition.

[0014] The possibility that TnsC regulates the connection between the TnsA+B transposase and the target site prompted the inventor to predict that TnsC mutants can be isolated that would constitutively activate Tn7 transposition.

[0015] TnsC therefore became an excellent candidate for mutagenesis, to search for a gain-of-function protein capable of circumventing the requirement for targeting proteins. The inventor therefore identified gain-of-function TnsC mutants which can activate the TnsA+B transposase in the absence of TnsD or TnsE. The inventor has characterized the ability of these mutants to promote insertions into various targets, and to respond to regulatory signals on those targets.

[0016] One class of TnsC mutants activates transposition in a way that is still sensitive to target signals, whereas a second class of TnsC mutants activates transposition in a way that appears to bypass target signals. As had been observed in vitro, the critical communication between the transposon and the target DNA appears to be an early event in the Tn7 reaction pathway in vivo, preceding the double-strand breaks at the transposon ends that initiate transposition.

[0017] A particular mutant isolated from the random mutagenesis is TnsC^(A225V), a mutant capable of an impressive activation of Tn7 transposition in the absence of TnsD (25). The single amino acid substitution made to generate TnsC^(A225V) has altered the protein such that it no longer requires an interaction with the target-associated TnsD, enabling it to activate transposition to a variety of target molecules very efficiently (25) (26). The inventor concluded that TnsC^(A225V) could promote transposition to target DNAs with low specificity based on results where transposition driven by the TnsABC^(A225V) machinery was directed to either F plasmids containing an attTn7 site, F plasmids lacking an attTn7 site, or the E. coli chromosome with no apparent preference.

[0018] DNA Sequencing

[0019] Sequencing DNA fragments cloned into vectors requires provision of priming sites at distributed locations within the fragment of interest, if the fragment is larger than the sequence run length (amount of sequence that can be determined from a single sequencing reaction). At present there are three commonly used methods of providing these priming sites:

[0020] A) Design of a new primer from sequence determined in a previous run from vector-encoded primer or other previously determined primer (prime and run, primer walking).

[0021] B) Random fragmentation and recloning of smaller pieces, followed by determination of the sequence of the smaller pieces from vector-encoded (universal) priming sites, followed by sequence assembly by overlap of sequence (random shotgun sequencing).

[0022] C) Deletion of variable amounts of the fragment of interest from an end adjacent to the vector, to bring undetermined fragment sequence close enough to the vector-encoded (universal) primer to allow sequence determination.

[0023] All of these methods have disadvantages.

[0024] Method A is time-consuming and expensive because of the delay involved in design of new primers and their cost. Moreover, if the fragment contains DNA repeats longer than the sequence run, it may be impossible to design a unique new primer; sequence runs made with primers within the repeat sequence will display two or more sequences that cannot be disentangled.

[0025] Method B requires recloning; random fragmentation is difficult to achieve because fragments that are efficiently clonable (restriction enzyme digestion) do not have ends randomly distributed (Adams, M. D., Fields, C. and Venter, J. C. editors Automated DNA Sequencing and Analysis Academic Press 1994; Chapter 6, Bodenteich, K. et al.), and fragmentation methods that provide randomly distributed ends (shearing, sonication) do not provide DNA ends that are efficiently clonable (with 5′ phosphate and 3′ OH moieties). Sequence assembly is also difficult or impossible when two or more repetitive sequences longer than the sequence run are present in the starting fragment.

[0026] Method C depends on providing randomly distributed end points for enzymatically-determined deletions. There are many methods for making such deletions (especially those involving exonuclease digestions, typically Exonuclease III), none of which provide entirely random endpoints and which depend on the presence of unique suitable restriction enzyme sites at one or both ends of the cloned fragment. However, because the deletion series in principle allows construction of a map (of nested remaining fragment lengths in deletion derivatives) that is independent of the sequence itself, this method can allow repetitive sequence longer than the sequence run to be located within the fragment at appropriate locations.

[0027] A method for introduction of universal priming sites at randomly distributed locations within a fragment of interest is therefore a useful advance in sequencing technology.
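
The relationship between priming-site placement and sequence run length can be illustrated with a short calculation. The sketch below is illustrative only and rests on stated assumptions: each inserted priming site is taken to support one read of `run_len` bases outward in each direction, coordinates are 0-based, and the function name is hypothetical. It reports the stretches of a fragment that no read would reach.

```python
def uncovered_gaps(fragment_len, insertion_sites, run_len):
    """Return (start, end) half-open intervals of the fragment not reached by any read.

    Assumption: an insertion at position p allows reads covering
    [p - run_len, p) and [p, p + run_len), clipped to the fragment.
    """
    # One merged window per insertion, clipped to the fragment boundaries.
    windows = sorted(
        (max(0, p - run_len), min(fragment_len, p + run_len))
        for p in insertion_sites
    )

    gaps = []
    covered_to = 0  # everything left of this coordinate is already accounted for
    for start, end in windows:
        if start > covered_to:
            gaps.append((covered_to, start))
        covered_to = max(covered_to, end)
    if covered_to < fragment_len:
        gaps.append((covered_to, fragment_len))
    return gaps


if __name__ == "__main__":
    # A 3 kb fragment, three insertions, 500-base reads.
    print(uncovered_gaps(3000, [500, 1400, 2600], 500))
```

Under these assumptions, insertions clustered at a few preferred sites leave large gaps, whereas insertions spread across the fragment close them with fewer clones.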

[0028] Transposition and the Sequencing Problem.

[0029] Previous efforts have been made to provide distributed priming sites by means of transposable elements. These methods have fallen short of this goal in three ways: first, the transposable elements have not provided a sufficiently random distribution of priming sites; second, the transposition method (carrying out transposition in vivo, followed by recovery of the targeted DNA and repurification) has been time-consuming and laborious; third, the systems have been prone to produce undesired products. These undesired products include but are not limited to: a) cointegrates (replicon fusions) between the donor of the transposon and the target plasmid; b) insertions in which the two ends of the transposon act at different positions (leading to deletion of the intervening target); c) insertions of multiple copies of the transposon into the target, so that priming from one end of the transposon yields two superimposed sequences. The method has been laborious in two ways: the majority of insertions have been into chromosomal DNA of the host, and even for those insertions into the plasmid the recovery method has entailed loss of independence of insertions. In vitro methods of insertion have suffered from both the non-random location of insertion sites and the undesired products, and also from poor efficiency, so that it has been impractical to obtain large numbers of insertions into the target of interest without excessive labor.

[0030] Increasing interest in large scale sequencing projects and a concomitant search for highly efficient in vitro mutagenesis methods has promoted the adaptation of several in vitro transposon systems as tools to study genomes. An in vivo reaction for the bacterial transposon Tn3 has been used to efficiently sequence plasmid inserts of variable lengths; however, only approximately 37% of the nucleotides were found to be capable of serving as sites for insertion (Davies, 1995 #419). A similar, more random system has been developed for yeast retrotransposon Ty1, employing synthetic transposons with U3 ends as substrates and Ty1 virus-like particles supplying transposition functions (28) to sequence plasmids with yeast and human DNA inserts. A disadvantage to this method is the requirement for the cumbersome preparation of VLPs. In vitro transposition with an MLV integrase system has been utilized as a tool to dissect some of the mysteries of chromatin packaging (29) (30) (31) and as a tool for functional genetic footprinting (32). However, the MLV insertions do not appear to be completely random. An object of the invention therefore is to provide a transposon and transposition reaction with more random target site specificity. Therefore, the inventor examined the target site selectivity of the TnsC^(A225V) machinery in vitro and explored the viability of this reaction as an effective tool for random insertional mutagenesis.

BRIEF SUMMARY OF THE INVENTION

[0031] Accordingly, a general object of the invention is to provide a transposable system that achieves efficient, simple, non-specific or random insertion into any given DNA segment.

[0032] A further object of the invention is to provide a transposable system that achieves efficient random insertional mutagenesis via simple insertion.

[0033] Therefore, a specific object of the invention is to provide a transposable system that achieves efficient, simple insertion with target site specificity that is reduced from wild-type and preferably random.

[0034] A more particular object of the invention is to provide a transposon containing a mutation in a transposon-derived protein that allows efficient, simple insertion and target site selectivity that is reduced from the wild-type, and preferably random.

[0035] A more particular object of the invention is to provide a transposable system with a mutation in a transposon-derived ATP-utilizing regulatory protein. The mutation allows the efficient, simple, non-specific or random insertion of the transposable element into a DNA segment or at least provides reduced target site specificity from the wild-type.

[0036] A preferred object of the invention is to provide a Tn7 transposable system that achieves simple, efficient, non-specific or random insertion into a given DNA segment, or at least reduced target site specificity compared to the wild-type Tn7.

[0037] A preferred object of the invention is to provide a mutation in the Tn7 transposon that confers efficient, simple, non-specific insertion into a given DNA segment, or at least reduced target site specificity compared to the wild-type Tn7.

[0038] A preferred object of the invention is to provide a Tn7 transposable system with a mutation in the TnsC protein encoded in the Tn7 transposon, which mutation allows efficient, simple insertion with reduced target site specificity compared to the wild-type, and preferably allows non-specific insertion into a DNA segment.

[0039] Objects of the invention include methods for using the above compositions.

[0040] Accordingly, a general object of the invention is to provide a method for efficient, simple, random insertion of a transposable element into a given DNA segment.

[0041] A further object of the invention is to provide a method for efficient, simple, random insertional mutagenesis by a transposable element.

[0042] A specific object of the invention is to provide a method for efficient, simple, random transposition of a transposable element into a DNA segment, or in which the specificity of transposition is reduced compared to wild-type.

[0043] A more particular object of the invention is to provide a method for efficient, simple, random transposition of a transposable element into a DNA segment in which the specificity of transposition is reduced compared to the wild-type by using a transposable system containing a mutation that confers efficient, simple insertion with reduced target site specificity compared to the wild-type, and preferably random insertion.

[0044] A more particular object of the invention is to provide a method for efficient, simple, random transposition of a transposable element into a DNA segment or in which the specificity of transposition is reduced compared to wild-type, by using a transposable system with a mutation in an ATP-utilizing regulatory protein, the mutation allowing the efficient, simple, non-specific insertion of the transposable element into a DNA segment or at least providing for reduced target site specificity compared to the wild-type.

[0045] A preferred object of the invention is to provide a method for efficient, simple transposition of a transposable element into a DNA segment in which the specificity of transposition is reduced compared to wild-type, or is preferably random, by providing a Tn7 transposable system that is capable of non-specific insertion into a DNA segment, or at least reduced target site specificity compared to the wild-type Tn7.

[0046] A further object of the invention is to provide a method for efficient, simple transposition of a transposable element into a DNA segment in which specificity of transposition is reduced compared to wild-type or is preferably random by providing a Tn7 mutation that confers the efficiency, ability to make a simple insertion, and the randomness or reduced specificity.

[0047] A further object of the invention is to provide a method for efficient, simple, random transposition of a transposable element into a DNA segment, or in which the specificity of transposition is reduced compared to the wild-type, by providing a mutation in the TnsC protein encoded in the Tn7 transposon, the mutation allowing a reduction in target site specificity compared to the wild-type and preferably allowing non-specific or random insertion of the Tn7 transposable element into a DNA segment.

[0048] A further object of the invention is to provide a method for DNA sequencing using a transposable system to introduce priming sites at randomly-distributed locations within a fragment of interest where the fragment is larger than the sequence run length.

[0049] A preferred object of the invention is to provide a method for DNA sequencing using a transposable system with a mutation that allows efficient and simple insertion and target site selectivity that is reduced from the wild-type and preferably random.

[0050] A preferred object of the invention is to provide a mutation in an ATP-utilizing regulatory protein. The mutation allows the efficient, simple, non-specific insertion of the transposon into a DNA segment or at least provides reduced target site specificity over wild-type.

[0051] A highly preferred object of the invention is to provide a method for DNA sequencing using a Tn7 transposable system that allows efficient, simple, non-specific insertion into a DNA segment or at least reduced target site specificity compared to the wild-type Tn7.

[0052] A highly preferred object of the invention is to provide a method for DNA sequencing using a Tn7 transposable system with a mutation in the TnsC protein, the mutation allowing efficient, simple insertion and a reduction in target site specificity compared to the wild-type and preferably allowing non-specific or random insertion of the Tn7 transposable element into the DNA segment.

[0053] A further object of the invention is to provide methods as described above that can be applied to any given DNA segment. These include, but are not limited to, plasmids, cellular genomes, including prokaryotic and eukaryotic, bacterial artificial chromosomes, yeast artificial chromosomes, and mammalian artificial chromosomes, and subsegments of any of these.

[0054] An object of the invention is to provide these methods in vitro or in vivo.

[0055] A further object of the invention is to provide kits for carrying out the above-described methods using the above-described transposons or parts thereof.

[0056] The inventor has accordingly developed a transposable system and methods that improve on in vitro and in vivo transposition methods previously described in that the methods are efficient for transposition, provide relatively random insertion, and almost all products recovered are simple insertions at a single site which thus provide useful information.

[0057] In a general embodiment of the invention, the invention is directed to a transposable system that achieves simple, efficient, random insertion into a given DNA segment.

[0058] In a further embodiment of the invention, the invention is directed to a transposable system that is capable of efficient random insertional mutagenesis, preferably by means of a simple insertion.

[0059] In a specific embodiment of the invention, the invention is directed to a transposable system with target site specificity that is reduced from the wild-type and preferably random, which allows simple and efficient insertion.

[0060] In a further specific embodiment of the invention, the invention is directed to a transposable system containing a mutation that allows target site specificity that is reduced from the wild-type and is preferably random.

[0061] In a preferred embodiment of the invention, the invention is directed to a transposable system with a mutation in an ATP-utilizing regulatory protein, the mutation allowing the efficient, simple, non-specific insertion of the transposon into a DNA segment or at least providing reduced target site specificity from the wild-type.

[0062] In a highly preferred embodiment of the invention, the invention is directed to a Tn7 transposable system that achieves efficient, simple, non-specific insertion into a given DNA segment, or at least reduced target site specificity compared to the wild-type Tn7.

[0063] In a highly preferred embodiment of the invention, the invention is directed to a mutation in a Tn7 transposon that confers the capability of efficient, simple, non-specific insertion into a DNA segment, or at least reduced target site specificity compared to the wild-type Tn7.

[0064] In a highly preferred embodiment of the invention, the invention is directed to a mutation in the TnsC protein encoded in the Tn7 transposon, the mutation allowing simple, efficient insertion and a reduction in target site specificity compared to the wild-type and preferably allowing non-specific or random insertion of the Tn7 transposable element into a DNA segment.

[0065] In a specific disclosed embodiment of the invention, the invention is directed to a Tn7 mutant designated TnsC^(A225V), which is a mutant having an alanine to valine substitution at amino acid number 225 in the TnsC gene.

[0066] The invention also embodies methods for using all of the above compositions. Methods are directed to transposition or insertion of the transposable elements described above.

[0067] Accordingly, in one embodiment, the invention provides generally for efficient, simple, random insertion of a transposon into a given DNA segment, or at least insertion with reduced specificity compared to the wild-type.

[0068] In a further embodiment of the invention, the invention is directed to methods for insertional mutagenesis using a transposable system that is capable of efficient, simple, random insertion or at least insertion with reduced specificity compared to wild-type.

[0069] In a further embodiment of the invention, the invention is directed to methods for insertion of a transposable element into a DNA segment in which target site specificity is reduced from wild-type and is preferably random, where insertion is efficient and simple.

[0070] In a further embodiment of the invention, the invention is directed to methods for insertion of a transposable element into a DNA segment, by providing a transposable element containing a mutation that allows efficient and simple insertion and target site specificity that is reduced from the wild-type and is preferably random.

[0071] In a preferred embodiment of the invention, the invention is directed to methods for inserting a transposable element into a DNA segment by providing a transposable system with a mutation in an ATP-utilizing regulatory protein, the mutation allowing simple, efficient, and non-specific insertion of the transposon into a DNA segment, or at least providing reduced target site specificity from the wild-type.

[0072] In a highly preferred embodiment of the invention, the invention is directed to methods for inserting a transposable element into a DNA segment by providing a Tn7 transposable system allowing efficient, simple, non-specific insertion into a given DNA segment or at least reduced target site specificity compared to the wild-type Tn7.

[0073] In a highly preferred embodiment of the invention, the invention is directed to a Tn7 transposable system with a mutation that allows simple, efficient, and non-specific insertion of a transposable element into a DNA segment or at least provides reduced target site specificity from the wild-type Tn7.

[0074] In a highly preferred embodiment of the invention, the invention is directed to methods for inserting a transposable element into a DNA segment by providing a Tn7 transposable system with a mutation in the TnsC protein, the mutation allowing efficient and simple insertion and a reduction in target site specificity compared to the wild-type and preferably allowing non-specific or random insertion of the Tn7 transposable element into a DNA segment.

[0075] In a specific disclosed embodiment of the invention, the invention is directed to methods for inserting a transposable element into a DNA segment, by providing the Tn7 mutant TnsC^(A225V).

[0076] The invention also provides kits for performing the above-described methods and the methods further described herein. In a preferred embodiment, a kit is supplied whose components comprise a mutant ATP-utilizing regulatory protein derived from a transposon, the mutation allowing efficient, simple, non-specific insertion of the transposon into a given DNA segment. The kit also provides a transposable element which can be found as part of a larger DNA segment; for example, a donor plasmid. The kit can further comprise a buffer compatible with insertion of the transposable element. The kit can further comprise a control target sequence, such as a control target plasmid, for determining that all of the ingredients are functioning properly. For DNA sequencing, the kit can further comprise sequencing extension primers with homology to one or more sites in the transposable element. Primers can have homology to sequences outside the transposable element (i.e. in a target vehicle).

[0077] In the kits, the mutant protein may be added as a purified protein product, may be encoded in the transposable element and produced therefrom, or encoded on vectors separate from the transposable segment, to be produced in vivo.

[0078] It is to be understood that the invention encompasses transposable systems with varying degrees of reduction of target site specificity from the wild-type which are useful for the purposes of the invention described herein.

BRIEF DESCRIPTION OF THE FIGURES

[0079] FIG. 1. Papillation phenotypes of the TnsC gain-of-function mutants. Cells were patched on MacConkey lactose plates and photographed after three days' incubation at 30° C. TnsA+B was present in each strain; the TnsC species present is indicated below each patch.

[0080] FIG. 2. Amino acid changes in the TnsC mutants. The TnsC protein sequence (SEQ ID NOS. 1 and 2) is cartooned, with the residues altered in the Class I mutants indicated above the protein and the Class II mutants below the protein. Hatched boxes represent Walker A and Walker B motifs.

[0081] FIG. 3. TnsC mutants promote transposition to the chromosome. Frequencies of transposition of miniTn7-Km^(R) from a λ phage to the chromosome were measured by the λ hop assay. TnsA+B was present in each strain; the TnsC species present is indicated below each column.

[0082] FIG. 4. TnsC mutants promote transposition to conjugable plasmids. Frequencies of transposition of miniTn7-Km^(R) from the chromosome to the conjugable target plasmid pOX-G were measured by the mating-out assay. TnsA+B was present in each strain; the TnsC species present is indicated below each column.

[0083] FIG. 5. The substrates, intermediates and products of Tn7 transposition. One substrate is a donor plasmid containing a miniTn7 element which contains the essential cis-acting sequences at each end for transposition. The other substrate is a target plasmid. Transposition initiates with a double strand break at either end of the element, followed by a second break at the other end to generate an excised linear transposon. This excised transposon is then joined to the target DNA to form a simple insertion.

[0084] FIG. 6. Analysis of Tn7 transposition reactions on an agarose gel. The donor plasmid, a pBR derivative, contained a miniTn7 element containing a kanamycin gene and the target plasmid contained an attTn7 site. Recombination reactions were carried out as described, and the DNAs were isolated from the reaction mixture by phenol extraction, digested with a restriction enzyme that cuts once in the donor backbone, displayed by electrophoresis on an agarose gel, transferred to a membrane by electrotransfer and hybridized with a probe specific for the miniTn7 element. Lane 1: TnsA+B; Lane 2: TnsA+B+C^(wt); Lane 3: TnsA+B+C^(E233K); Lane 4: TnsA+B+C^(S401YΔ402); Lane 5: TnsA+B+C^(A225V).

[0085] FIG. 7. Tn7 insertion mediated by TnsA+B+C^(A225V) occurs at many different sites in a target DNA. In vitro transposition reactions using TnsA+B+C^(A225V) were carried out and the DNAs isolated by phenol extraction and ethanol precipitation. A PCR reaction using the transposition products as a template was then carried out with one primer (NLC 209) (SEQ ID NO:8) complementary to a sequence on the target DNA and another primer (NLC 95) (SEQ ID NO:7) complementary to the left end of Tn7. The length of the PCR products will vary depending on the position of the Tn7 insertion; for example, insertions closer to the target primer will give shorter products (insert 1) and those more distant will give longer products (insert 2). The products of the in vitro reaction were then displayed on a denaturing acrylamide gel by electrophoresis, transferred from the gel to membranes and analyzed by hybridization to a radioactively labeled probe that hybridizes to Tn7 sequences on one end of the transposon.
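The relationship between insertion position and PCR product length described for FIG. 7 can be sketched as follows. This is only an illustration under stated assumptions: the function name, the coordinate convention, and the example numbers are hypothetical and are not taken from the specification, and only insertions oriented so that the Tn7 left-end primer faces the target primer are considered.

```python
def pcr_product_length(target_primer_pos, insertion_pos, tn7_primer_offset):
    """Predicted PCR product length in bp (illustrative model only).

    target_primer_pos: target coordinate of the 5' end of the target primer,
        assumed to extend toward higher coordinates.
    insertion_pos: target coordinate of the junction with the Tn7 left end.
    tn7_primer_offset: distance in bp from the 5' end of the Tn7 end primer
        to the transposon terminus (hypothetical parameter).
    """
    if insertion_pos < target_primer_pos:
        raise ValueError("insertion lies upstream of the target primer")
    # Target sequence between primer and junction, plus the stretch of
    # transposon end covered by the Tn7 primer.
    return (insertion_pos - target_primer_pos) + tn7_primer_offset


# Hypothetical example: a near insertion and a more distant insertion.
near = pcr_product_length(100, 350, 40)
far = pcr_product_length(100, 900, 40)
```

Under this model each distinct insertion site gives a distinct product length, which is why a ladder of many bands is consistent with insertion at many sites.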

[0086] FIG. 8. Analysis of distribution of insertions in different regions of the plasmid. Tn7 displays little target site selectivity at many regions of a target. In vitro transposition reactions were carried out and the products used as a template for PCR reactions as described above except for the target primer. In these experiments, one primer in the end of Tn7 (NLC 95) (SEQ ID NO:7) was used and in separate reactions primers from several different positions in the target DNA were used.

[0087] FIGS. 9A-C. Structure of Tn7 donor plasmids. A. A plasmid contains a miniTn7 element in which the essential cis-acting sequences at the element termini flank a selectable marker. The translocation of the element can be readily followed by hybridization to a miniTn7-specific probe. Many different kinds of information could be placed inside the ends, such as a selectable (or identifiable) marker, for example, an antibiotic resistance gene. If the products of transformation are to be recovered in vivo, it is convenient to remove unreacted donor DNA by digestion with a restriction enzyme that is selective for the donor backbone; alternatively a conditional replicon can be used. B. Sequence of donor plasmid pEM delta R.adj to 1 (SEQ ID NO:3). The plasmid carries a 1625 bp mini-Tn7 element: 199 bp of Tn7R and 166 bp of Tn7L flank a Kan gene with SalI sites at the junctions. The backbone is pTRC99 (Pharmacia); mini-Tn7 plus flanking host DNA was cloned into the SmaI site. C. A commonly used derivative is pEM-Δ (SEQ ID NO:4), a pBR plasmid containing a kanamycin mTn7 element.

[0088] FIGS. 10A-B. Tn7 target plasmids. A. Sequence of target plasmid pER183 (SEQ ID NO:5). This 8.9 kb pACYC184 derivative carries chloramphenicol resistance, a p15A origin of replication, and inserts carrying mcrB, mcrC, hsdS, and a segment of phage fl. A large target was used to detect preference of moderate complexity (up to four bp preferences should be detectable). In addition, different segments of the plasmid vary in G+C content from 35% to 68%, so that any preference the transposition system might display for a particular G+C content might be revealed. B. The major targets used in this work are pRM2 (SEQ ID NO:6), a 3190 bp pBR derivative containing an attTn7 segment, and pER183 (SEQ ID NO:5), a pACYC derivative containing several E. coli genes.

[0089] FIG. 11. Diagram of sequencing runs used to ascertain the positions of 63 insertions of mini-Tn7 into pER183 (SEQ ID NO:5). Numbers at the top refer to coordinates on the sequence of pER183 (SEQ ID NO:5) displayed in FIG. 11B. Arrows indicate the direction of primer extension; arrow stems cover the sequence obtained from the run. The arbitrary numbers attached to the arrows were assigned by the sequence assembly program AUTOASSEMBLE.

[0090] FIG. 12. Graph of the observed distribution of insertions in 100-bp intervals of pER183 (SEQ ID NO:5) and the distribution expected if the distribution were random. On the abscissa is the number of insertions per interval; on the ordinate is the number of intervals that exhibit that number of insertions. Crosses show the expected values for a random (Poisson) distribution of insertions along the sequence; diamonds show the observed values.
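The comparison plotted in FIG. 12 can be illustrated with a minimal sketch. The function names are hypothetical, the target length of 8900 bp is an approximation of the "8.9 kb" stated for pER183, and the example insertion coordinates are invented for illustration; none of this is data from the specification.

```python
import math


def expected_poisson_counts(n_insertions, n_intervals, max_k):
    """Expected number of intervals holding exactly k insertions, k = 0..max_k,
    if insertions fall at random (Poisson) along the target."""
    lam = n_insertions / n_intervals  # mean insertions per interval
    return [n_intervals * math.exp(-lam) * lam ** k / math.factorial(k)
            for k in range(max_k + 1)]


def observed_counts(positions, interval_bp, target_len, max_k):
    """Histogram of intervals by insertion count from 0-based coordinates.
    Intervals with more than max_k insertions are lumped into the last bin."""
    n_intervals = math.ceil(target_len / interval_bp)
    per_interval = [0] * n_intervals
    for p in positions:
        per_interval[p // interval_bp] += 1
    hist = [0] * (max_k + 1)
    for c in per_interval:
        hist[min(c, max_k)] += 1
    return hist


# Assumed values: 63 insertions over about 8900 bp in 100-bp intervals.
expected = expected_poisson_counts(63, 89, 5)
# Hypothetical coordinates, for illustration only.
observed = observed_counts([5, 50, 150, 8899], 100, 8900, 3)
```

Plotting `expected` against an `observed` histogram built from real insertion coordinates would reproduce the kind of comparison the figure describes.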

[0091] FIG. 13. The base composition of the 5 bp sequences duplicated by the process of Tn7 insertion for the 63 sites examined. On the abscissa, sequence positions are numbered relative to the right end of Tn7 (Tn7R) such that position 1 is immediately adjacent and position 5 is 5 bp away (see diagram below the graph). On the ordinate is the number of instances of a particular base at that position. All bases are well represented at all sites.
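A tally of the kind shown in FIG. 13 could be computed as sketched below. The function name and the example sequences are hypothetical; the sketch assumes each duplicated sequence is supplied as a 5-character string oriented so that its first character is position 1 (adjacent to Tn7R).

```python
from collections import Counter


def base_composition(duplications, length=5):
    """Count A, C, G and T at each position of the target-site duplications.

    Returns a list of dicts, one per position; index 0 corresponds to
    position 1 (immediately adjacent to Tn7R) under the assumed orientation.
    """
    counts = [Counter() for _ in range(length)]
    for seq in duplications:
        seq = seq.upper()
        if len(seq) != length:
            raise ValueError(f"expected {length} bp, got {seq!r}")
        for i, base in enumerate(seq):
            counts[i][base] += 1
    # Report all four bases, including those never seen at a position.
    return [{b: c[b] for b in "ACGT"} for c in counts]


# Hypothetical duplications, for illustration only.
table = base_composition(["ACGTA", "AAGTC"])
```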

[0092] FIG. 14. Effect of four methods of stopping the transposition reaction in preparation for introduction into cells. Results for four replicates (abscissa) of each of four stop methods (z axis), reported as number of transformants per 1/50th of the total reaction (ordinate). Treatments were: no treatment; heat treatment at 65° C. for 20 min; heat treatment at 75° C. for 10 min; and phenol extraction followed by ethanol precipitation. Heat treatment at 75° C. but not 65° C. allows effective recovery.

[0093] FIG. 15. A second experiment displaying the effect of three methods of stopping the transposition reaction in preparation for introduction into cells. Results are shown for two replicates of each of three stop methods (abscissa) for four doses of two different aliquots of TnsB (z axis), reported as number of transformants per 1/25th of the reaction. Treatments were: heat treatment at 75° C. for 10 min; ethanol precipitation alone; or heat treatment at 65° C. for min. On the z axis, two aliquots (1- or 2-) of TnsB were used, in four different doses: 1 μl, 1.5 μl, 2 μl or 3 μl. The row labeled 1-2, for example, employed aliquot 1 and used 2 μl of it. Heat treatment at 75° C. but not 65° C. allows effective recovery. This experiment also illustrates the dose-response to TnsB.

[0094] FIG. 16. Effect of two methods of storing proteins on the efficiency of the transposition reaction. Abscissa displays the storage conditions tested: “individually”, TnsA, TnsB and TnsC proteins stored individually in separate tubes at −70° C.; “as a mixture (A2a)”, TnsA, TnsB and TnsC proteins stored together as a mixture at −70° C. Ordinate displays the number of transformants per 1/50th of the total reaction. Each treatment was tested in quadruplicate.

[0095] FIG. 17. Effect of three methods of storing proteins on the efficiency of the transposition reaction. Abscissa displays the storage conditions tested: “individually”, TnsA, TnsB and TnsC proteins stored individually in separate tubes at −70° C.; “as a mixture, −70 (A2a)”, TnsA, TnsB and TnsC proteins stored together as a mixture at −70° C.; “as a mixture, −20 (A2b)”, TnsA, TnsB and TnsC proteins stored together as a mixture at −20° C. Ordinate displays the number of transformants per 1/50th of the total reaction. Each treatment was tested in quadruplicate.

[0096] FIGS. 18A-18B. 18A. Nucleotide sequence of TnsC (SEQ ID NO:1). 18B. Amino acid sequence of TnsC (SEQ ID NOS:1 and 2).

DETAILED DESCRIPTION OF THE INVENTION

[0097] In the art, the term “transposon” encompasses a segment flanked by particular cis-acting sites that are required for mobilization to occur, together with the genes that specify the proteins that act on those cis-acting sites to mobilize the segment defined by them, whether or not the protein-encoding genes lie between the sites mentioned. For example, according to the present invention, a Tn7 transposon can correspond to the wild type transposon except that the transposon encodes a mutant TnsC. This transposon thus provides the protein products required for mobilization. However, an entire transposon is not necessary to practice the invention. Thus, the term “transposon derivative”, “transposable element”, or “insertable element” as used herein can also refer to DNA minimally comprising the cis-acting sites at which the trans-acting proteins act to mobilize the segment defined by the sites. It is also understood that the sites may contain intervening DNA.

[0098] The phrase “transposable system” as used herein encompasses a transposon containing a mutation in a native ATP-utilizing regulatory protein which, when expressed from the transposon, allows for the non-specific target site selectivity or reduced target site selectivity disclosed herein. The phrase also encompasses modifications in which the relevant proteins are not encoded on the transposable element but nevertheless act upon it to achieve the objects of the invention. Thus, the system encompasses compositions in which the mutant protein is added to a transposable element that is derived from a transposon but where the element contains less than the full complement of genes. The only limitation on this element is that it contain the cis-acting sequences upon which the mutant protein acts to allow integration of the element into a target DNA. Thus, the system comprises DNA with cis-acting sites (which may contain heterologous DNA sequences) and the trans-acting proteins that employ those sites to mobilize the segment defined by the sites, regardless of how they are organized in DNA. Accordingly, the proteins may be provided in separate plasmids or in purified form.

[0099] The term “transposon-derived”, as used herein to refer to the mutant protein, refers to a derivative of a protein normally found on the transposon. However, this need not be the naturally occurring protein but can be the protein produced by recombinant or chemical synthetic methods known to those in the art.

[0100] The term “transposable element” encompasses both transposons and derivatives thereof. The only limitation on the derivative is that it is capable of integrating into DNA, containing cis-acting sequences that interact with trans-acting proteins to effect integration of the element.

[0101] The invention provides a transposable system that allows simple integration of a transposable element into a given DNA target efficiently and with a relatively low degree of specificity, preferably random specificity. By “relatively” is intended the degree of specificity compared to the wild-type.

[0102] The efficiency of integration can vary depending upon the particular use for which insertion is desired. The mutations described herein increase the efficiency of integration compared to the wild-type frequency. The invention encompasses an efficiency of one simple integration event per every 5-10 kilobases. Preferred levels of integration allow multiple simple insertions in different positions in every gene.

[0103] Integration is also affected by the degree of specificity that the mutation confers or allows. Thus, specificity relates to the relationship of a target DNA sequence and the transposable system.

[0104] A preferred degree of specificity results in an average insertion in every gene. A practical lower limit would be, on average, one insertion per twenty genes.

[0105] For sequencing, greater than or equal to 90% of the insertions screened are at different locations (i.e. 10 insertions hit at least 9 different sites) so that almost every template examined gives new information. This is true in DNAs of a variety of different base compositions since possible target DNAs may vary between 20% and 80% G+C. Another way to describe the possible randomness of the system is to say that of 63 insertions, 62 insertion sites were found (around 98% of insertions are at different locations).
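The randomness measure used above (the fraction of screened insertions that fall at different locations) can be written out as a short sketch. The function name and the 0.9 default threshold parameter are illustrative assumptions based on the 90% figure stated above; the site coordinates are placeholders, not the actual insertion positions.

```python
def distinct_site_fraction(sites):
    """Fraction of insertions that fall at distinct target coordinates."""
    if not sites:
        raise ValueError("no insertions supplied")
    return len(set(sites)) / len(sites)


def meets_sequencing_criterion(sites, threshold=0.9):
    """True if at least `threshold` of insertions are at different sites."""
    return distinct_site_fraction(sites) >= threshold


# Placeholder coordinates: 63 insertions occupying 62 distinct sites.
sites = list(range(62)) + [0]
fraction = distinct_site_fraction(sites)  # 62/63, about 0.98
```

With 10 insertions at 9 distinct sites the fraction is 0.9, which sits exactly at the stated sequencing criterion.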

[0106] For mutagenesis, non-commercial systems have been widely used that yield as little as 10% of insertions at different sites (i.e. 9 of 10 insertions are at the same site). The present invention improves on this level of randomness.

[0107] Furthermore, the types of insertions that are relevant to the discussion of frequency are simple insertions.

[0108] The invention provides a transposable system with a mutation that provides for efficient, simple insertion and reduced or random target site specificity.

[0109] The term “simple insertion” refers to a single copy integration event of the element introduced into the target by double-strand breakage and rejoining.

[0110] Although simple insertions (only one copy of the integrant) are preferred, there may be certain embodiments in which more than one copy does not interfere with the purpose of the application (for example, some applications of in vitro mutagenesis) or is actually desirable (for example, where multiple copies of a heterologous DNA sequence are to be inserted). Accordingly, the invention is not limited to the case in which the transposable system provides for simple insertion only.

[0111] In a preferred embodiment of the invention the mutation is in a transposon-derived ATP-utilizing regulatory protein. One can recognize such a protein by its similarity to the TnsC protein of Tn7, that is, by its sequence homology, its possession of a protein sequence motif element similar to an ATP binding site motif in other ATP-dependent proteins, or by reconstitution of an in vitro transposition system and demonstration of a requirement for nucleotides in that in vitro transposition system.

[0112] In a highly preferred embodiment, the mutation is in the TnsC gene (SEQ ID NO:1) encoding the TnsC protein (SEQ ID NOS:1 and 2) of Tn7. This mutation provides a Tn7 transposon that is capable of relatively non-specific insertion into a given DNA segment.

[0113] Thus, the invention is directed to insertion of the Tn7 transposable element but is not limited to this transposable element. Accordingly, the invention can be practiced with transposable elements related to Tn7 in that transposition occurs by means of an ATP-mediated process. Thus, mutations in the ATP-utilizing proteins in such transposons are contemplated in this disclosure. Accordingly, transposons with ATP-utilizing regulatory proteins in addition to Tn7 are encompassed in the invention. Examples of such transposons are Tn5090/Tn420; the transposon-encoded transposition proteins are TniA, TniB, and TniQ. TniB would be the ATP-utilizing protein.

[0114] Another class of transposon encompassed by the invention is one in which it is possible to increase the frequency by altering the ATP-utilizing proteins; examples are Tn552 and IS21.

[0115] The invention provides for the insertion of the transposable elements described herein into any DNA segment of any organism. Moreover, the invention also provides for the insertion into any synthetic DNA segment.

[0116] Insertion of the transposable element can be in vivo. In this case, the transposable element is introduced into a desired host cell, where it inserts directly into DNA in that cell. The only limitation is that the transposable element be capable of insertion in the specific host cell DNA. Thus, as long as the proteins required for transposition can be expressed in a desired cell, this cell can provide a host for insertion of the transposable element into any DNA found in that host cell.

[0117] Insertion can be for the purpose of gene inactivation. Gene inactivation is useful for genetic analysis (e.g. gene function).

[0118] Genetic analysis includes:

[0119] assessment of the phenotype of a null allele (not expressing functional protein due to interruption of the gene by the transposable segment); assessment of the consequences of insertion of particular active DNA structures or sequences for genetic properties of chromosomes or their parts, such as but not limited to accessibility to DNase I or to footprinting reagents, or expression or silencing of nearby transcribable genes, or for activity of genetic or epigenetic processes such as, but not limited to, homologous recombination, chemical mutagenesis, oxidative DNA damage, DNA methylation, insertion of proviruses or retroposons; assessment of protein domain structure via creation of multiple interruption points within a gene for a multidomain protein, wherein a gene product missing one or more domains of the multidomain protein might exhibit partial activity or activities, including antigenic activities or immunodominant epitopes [randomness is paramount here; many insertion positions are needed if borders are to be defined accurately]; assessment of expression pattern via creation of transcriptional fusions of a promoter in the target to a reporter (e.g. beta galactosidase or green fluorescent protein or chloramphenicol transacetylase or luciferase) within the transposable segment; assessment of expression pattern via creation of translational fusions of a portion of a gene product encoded by a target to a gene product or an antigenic peptide encoded by the transposable segment (e.g. beta galactosidase or an epitope tag or an affinity tag); assessment of operon structure, in which interruption of transcription by insertion upstream of a gene results in altered expression of a gene without disrupting the coding sequence of that gene; gratuitous expression of a gene, in which transcription from a promoter within the transposable segment results in expression of a gene downstream of the position of insertion of the transposable segment, with or without regulation of transcription of the promoter within the transposable segment; gratuitous expression of a protein fusion, in which transcription from a promoter within the transposable segment results in translation of a protein beginning within the transposable segment and proceeding toward the outside of the transposon, then continuing into the gene within which the transposable segment is inserted, resulting in a fusion of the transposon-encoded protein with the target protein; assessment of the consequences of introducing into the host cell any transcript or gene product entirely encoded within the transposable segment, especially where it is desirable to assess position-effects (the consequences not only of expression but of expression in different positions within the genome).

[0120] Insertion can also be for the purpose of introducing heterologous DNA sequences into the DNA of a host cell. The DNA in the host cell in which the insertion occurs can be the host genomic DNA or extrachromosomal elements. This includes both naturally-occurring elements and elements introduced exogenously.

[0121] Heterologous genes that can be introduced via the insertion include reporter genes. DNA sequences can also be introduced that provide physical markers in a chromosome. Insertion can also be used as a simple way to recover the host DNA that is flanking the inserted element. Genomic DNA is cut with restriction enzymes and the insertion plus the flanking DNA is then cloned.

[0122] Another application of the invention is to analyze the interaction of various non-transposition proteins with a DNA sequence, for example, DNase footprinting of repressors bound to DNA. A further use is to study the structure of genomic chromatin, i.e., the state in which DNA is actually found in the cell.

[0123] A further advantage in using Tn7 and similar transposons is that of double end or “concerted” joining. Accordingly, Tn7 inserts in a “cut and paste” manner with both ends of the transposon being joined to the target DNA.

[0124] Insertion can also be in vitro. In vitro insertion provides an advantage over insertion in vivo. Using in vitro insertion, the transposable element can be placed in any DNA target and that target then introduced into a host cell where it can integrate or replicate. Accordingly, this greatly expands the host cell range.

[0125] Targets for insertion, accordingly, include DNA fragments, plasmids and other extrachromosomal elements capable of replication in prokaryotic and/or eukaryotic host cells. Given the array of plasmids available, potentially any cell can be used as a host for an insertion target containing a transposable element that was introduced into the target in vitro. The target can be based on a bacterial plasmid, bacteriophage, plant virus, retrovirus, DNA virus, autonomously replicating extrachromosomal DNA element, linear plasmid, mitochondrial or other organelle DNA, chromosomal DNA, and the like.

[0126] When introduced into the host cell, the target can be maintained as an autonomously replicating sequence or extrachromosomal element or can be integrated into host DNA. When integrated, integration can occur by homologous recombination or by means of specific integration sequences such as those derived from retroviruses, DNA viruses, and the like.

[0127] It may be, but is not necessarily, desirable to obtain replication of the target in the host cell. A specific application in which this is desirable is the case in which a transposable element is used as a component for introducing primer binding sites for DNA sequencing.

[0128] Accordingly, in a highly preferred embodiment of the invention, a transposable element is introduced into a target containing a DNA segment for which a sequence is desired. This target is then introduced into a host cell where it is allowed to replicate, thus producing sufficient copies to allow DNA sequencing using a primer specifically recognizing a sequence in the target.

[0129] In one embodiment of this method, the primers recognize one or both ends of the transposable element such that sequencing can proceed bidirectionally from the transposable element insertion site into the surrounding DNA. The target may be composed entirely of DNA segments for which the sequence is required or may simply contain subsequences for which a sequence is required. In this aspect the only limitation on the target is that it is able to replicate in the host cell (and therefore contains sequences that allow this to occur).

[0130] It is also highly desirable that the target have a selection marker in order to eliminate the background in host cells containing the target without the insertion of the transposon.

[0131] An alternative way to eliminate this background, however, is to provide a method for disabling a target that has not received an insertion so that it is unable to replicate in the host cell and is thus diluted out during host cell culture. Accordingly, the transposable element itself could contain an origin of replication for the host cell. Thus targets not receiving an insertion would be unable to replicate. An insertion could also result in the formation of functional replication sequences. The target could also contain a heterologous conditional origin, such as the R6K origin, that cannot replicate without the pir protein. The person of ordinary skill in these arts would be aware of the various methods for constructing targets with the (in)ability to replicate in a specific host cell.

[0132] It is also possible, however, to use the transposable elements described herein for DNA sequencing without the in vitro insertion described above. Insertion could be directly accomplished in host cell DNA and then the DNA containing the insertion removed from the host. This DNA segment could then be replicated, although it does not necessarily have to be if the host has produced sufficient copies for sequencing. Accordingly, sufficient numbers of the segment with the insertion sequence could then be sequenced as above.

[0133] An example of the case in which the DNA segment receiving an in vivo insertion would not need to be further replicated in another host is one in which the insertion occurs in a sequence capable of being amplified directly in the host cell. This could be a plasmid containing an amplifiable marker, such as the dhfr gene, the cell being grown in a selective medium containing methotrexate. The person of ordinary skill in the art would know the various methods for amplifying DNA segments using selectable markers. The selectable marker could be introduced on the transposon but would not necessarily need to be.

[0134] In a further DNA sequencing protocol, the primers that are used facilitate DNA segment amplification by the PCR reaction. For example, a primer can be used that recognizes an end of the transposable element with the second primer being found in the target DNA sequence. The primer could be based on random sequences or on known sequences deliberately placed in the target vehicle. Thus the target vehicle could contain a characterized plasmid (as an example) in which the sequences are known. In this instance, primers can be designed to hybridize to any area within the plasmid, the segment to be sequenced being between the transposon and the second primer site in the target vehicle.

[0135] In accordance with the above-described embodiment, the invention is also directed to kits for performing transposable element insertion in vitro. As described, such insertions can be used to provide priming sites for DNA sequence determination or to provide mutations suitable for genetic analysis or both.

[0136] Essential components in the kit are gene products allowing transposition that are normally encoded on the transposable element or their functional equivalents. A further component is a transposable element donor vehicle. This nucleic acid vehicle provides the transposable element to be inserted into a given specific target. The transposable element donor is preferably DNA but could encompass RNA, being operable via a cDNA copy. Preferred DNA vehicles include, but are not limited to, bacterial plasmids. Other vehicles include any DNA that can be isolated in supercoiled form or placed into a supercoiled configuration by the use of topoisomerases, for example, bacteriophage DNA, autonomously replicating molecules from eukaryotes or archaea, or synthetic DNA that can be ligated to form a topologically closed circle.

[0137] Optional components of the kit include one or more of the following: (1) buffer constituents, (2) control target plasmid, (3) sequencing primers. The buffer can include any buffer suitable for allowing the transposition activity to occur in vitro. A preferred embodiment is HEPES buffer. A specific disclosed embodiment is included in the exemplary material herein.

[0138] Preferred donor plasmids do not need to be destroyed before introducing transposition products into commonly used bacterial, preferably E. coli, strains. These vectors do not replicate without regulatory genes not provided by the host cell which allow a functional replication origin. An example is the pir gene which is present only in specially constructed strains, having been derived from the plasmid R6K. In this way, artifactual background consisting of cells transformed with both the donor DNA and the target DNA without any transposition having occurred is eliminated. As discussed herein, there are other ways to do this such as restriction digestion of the donor DNA but not of the target or transposable segment or deletion and titration of the transposition reaction so that there are more cells than DNA molecules in the transformation step. However, these are not preferred.

[0139] The control target plasmid does not contain the transposable element and does contain a transposable element integration site. The purpose is to assure that the reaction is not inhibited by a contaminant in non-kit ingredients (introduced by the kit user); i.e. it ensures that all components allow optimal insertion.

[0140] Sequencing primers include, but are not limited to, primers that have homology with both ends of the transposable element and, as such, allow sequencing to proceed bidirectionally from the ends of the transposable element. However, primers could be made to any area within the transposable element or within the target vehicle itself as long as extension is allowed into the DNA segment to be sequenced. Kits designed for allowing sequencing by the PCR reaction may also include a second primer that allows the amplification of the sequence between the first and second primers.

[0141] The control target plasmid preferably contains a selectable marker for recovery of the desired DNA segment from a specific host cell. It is understood that, when using the kit, the target DNA does not carry the same selectable marker as the control target nucleic acid.

[0142] A fourth optional component of a kit is target DNA itself. Target DNA that might be desirable would include but is not limited to purified chromosomal DNA, total cDNA, cDNA fractionated according to tissue or expression state (e.g. after heat shock or after cytokine treatment or other treatment) or expression time (after any such treatment) or developmental stage, or a plasmid, cosmid, BAC, YAC or phage library of any of the foregoing DNA samples, especially such target DNA from important study organisms such as Homo sapiens, Mus domesticus, Mus spretus, Canis domesticus, Bos, Caenorhabditis elegans, Plasmodium falciparum, Plasmodium vivax, Onchocerca volvulus, Brugia malayi, Dirofilaria immitis, Leishmania, Zea mays, Arabidopsis thaliana, Glycine max, Drosophila melanogaster, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Neurospora, Escherichia coli, Salmonella typhimurium, Bacillus subtilis, Neisseria gonorrhoeae, Staphylococcus aureus, Streptococcus pneumoniae, Mycobacterium tuberculosis, Aquifex, Thermus aquaticus, Pyrococcus furiosus, Thermus littoralis, Methanobacterium thermoautotrophicum, Sulfolobus caldoaceticus, and others.

[0143] Other suitable selectable markers include chloramphenicol resistance, tetracycline resistance, spectinomycin resistance, streptomycin resistance, erythromycin resistance, rifampicin resistance, bleomycin resistance, thermally adapted kanamycin resistance, gentamycin resistance, hygromycin resistance, trimethoprim resistance, dihydrofolate reductase (DHFR), GPT, and the URA3, HIS4, LEU2, and TRP1 genes of S. cerevisiae.

[0144] There may be certain instances in which it is desired to introduce primer binding sites other than those naturally found in the transposable element or in the insertion vehicle. In this case, the transposable element can be used as a vehicle for introducing any desired primer or primers. Examples of when the use of exogenous primers may be desirable are the case in which the transposable element ends form a secondary structure that interferes with sequencing, cases in which there is a similarity of sequence between the two ends of the transposable element, and cases in which the only practical binding sites in the transposable element are so far internal that they undesirably curtail the number of nucleotides that can be sequenced from that site.

[0145] The invention also generally encompasses compositions containing an ATP-dependent DNA binding protein encoded by a transposon, the protein containing a mutation conferring reduced target site specificity, preferably random target site insertion.

[0146] The protein is isolated from a biological preparation produced invivo or in vitro. Thus, the protein is purified or substantiallypurified from cellular components with which it is found in vivo. Whenproduced in vitro, the protein may also be purified or substantiallypurified from the other components used to produce it.

[0147] In preferred embodiments the protein is the TnsC protein (SEQ IDNOS:1 and 2).

[0148] In a specific disclosed embodiment, the protein contains a valineat amino acid number 225.

[0149] The invention is also directed to compositions containing theprotein described herein and the transposable element substrate on whichthe protein acts to cause insertion.

[0150] Compositions can also include target DNA into which thetransposable element is capable of being inserted.

[0151] The mutant proteins of the present invention include the naturally occurring proteins encoded by a transposon as well as any substantially homologous and/or functionally equivalent variants thereof. By “variant” protein is intended a protein derived from the native protein by deletion (so-called truncation) or addition of one or more amino acids to the N-terminal and/or C-terminal end of the native protein; deletion or addition of one or more amino acids at one or more sites in the native protein; or substitution of one or more amino acids at one or more sites in the native protein. Such variants may result from, for example, genetic polymorphism or from human manipulation. Methods for such manipulations are generally known in the art.

[0152] For example, amino acid sequence variants of the polypeptide can be prepared by mutations in the cloned DNA sequence encoding the native protein of interest. Methods for mutagenesis and nucleotide sequence alterations are well known in the art. See, for example, (37) (38) (39) (40); U.S. Pat. No. 4,873,192; and the references cited therein; herein incorporated by reference. Guidance as to appropriate amino acid substitutions that do not affect biological activity of the protein of interest may be found in the model of Dayhoff et al. (1978) (41) in Atlas of Protein Sequence and Structure (Natl. Biomed. Res. Found., Washington, D.C.), herein incorporated by reference. Conservative substitutions, such as exchanging one amino acid with another having similar properties, may be preferred.

[0153] In constructing variants of the protein of interest, modifications to the nucleotide sequences encoding the variants will be made such that variants continue to possess the desired activity. Obviously, any mutations made in the DNA encoding the variant protein must not place the sequence out of reading frame and preferably will not create complementary regions that could produce secondary mRNA structure. See EP Patent Application Publication No. 75,444.

[0154] Thus nucleotide sequences of the invention and the proteins encoded thereby include the naturally occurring forms as well as variants thereof. The variant proteins will be substantially homologous and functionally equivalent to the native protein. A variant of a native protein can be “substantially homologous” to the native protein when at least about 80%, more preferably at least about 90%, and most preferably at least about 95% of its amino acid sequence is identical to the amino acid sequence of the native protein. However, substantial homology includes high homology in the catalytic or other conserved functional regions with possible low homology outside these. By “functionally equivalent” is intended that the sequence of the variant defines a chain that produces a protein having substantially the same biological effect as the native protein of interest. Thus, for purposes of the present invention, a functionally equivalent variant will confer the phenotype of activating transposition with reduced target site specificity, preferably random. Such functionally equivalent variants that comprise substantial sequence variations are also encompassed by the invention.
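The identity thresholds stated in this paragraph can be illustrated with a short sketch. It is an illustration only, under stated assumptions: the two sequences are assumed to be already aligned to equal length with "-" marking gaps, identity is counted over the residues of the variant, and the function names and the ten-residue peptides are hypothetical and are not taken from SEQ ID NOS:1 and 2. The sketch does not capture the case, also described above, of high homology confined to conserved functional regions.

```python
def percent_identity(native_aln: str, variant_aln: str) -> float:
    """Percent of the variant's residues that are identical to the native
    residue at the same aligned position (gaps are written as '-')."""
    if len(native_aln) != len(variant_aln):
        raise ValueError("sequences must be aligned to the same length")
    variant_residues = sum(1 for v in variant_aln if v != "-")
    if variant_residues == 0:
        raise ValueError("variant contains no residues")
    identical = sum(
        1 for n, v in zip(native_aln, variant_aln) if v != "-" and n == v
    )
    return 100.0 * identical / variant_residues


def homology_tier(pct: float) -> str:
    """Map a percent identity onto the tiers named in the paragraph.
    The text says 'about', so these cut-offs are not strict limits."""
    if pct >= 95.0:
        return "at least about 95% (most preferred)"
    if pct >= 90.0:
        return "at least about 90% (more preferred)"
    if pct >= 80.0:
        return "at least about 80%"
    return "below the stated thresholds"


# Hypothetical aligned peptides differing by one substitution (A -> V).
NATIVE = "MSEQAVTLKR"
VARIANT = "MSEQVVTLKR"

if __name__ == "__main__":
    pct = percent_identity(NATIVE, VARIANT)
    print(f"{pct:.1f}% identical: {homology_tier(pct)}")
```

A sequence-identity figure of this kind addresses only the "substantially homologous" criterion; functional equivalence would still have to be shown by a transposition assay.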

[0155] The invention also encompasses compositions containing a transposable element containing DNA sequence encoding an ATP-utilizing regulatory protein, the protein containing a mutation that confers reduced target site specificity and preferably random insertion.

[0156] In preferred embodiments of the invention, the transposable element is a Tn7 transposable element.

[0157] In specific disclosed embodiments, the mutation is valine at amino acid number 225 in the TnsC protein.

[0158] The invention also encompasses compositions containing the above-described transposable element and a given DNA segment intended to be the target for insertion of the transposable element.

[0159] The invention, accordingly, is directed to DNA into which has been inserted the transposable element containing the mutation described herein that confers simple, efficient insertion with reduced target site specificity or random target site insertion. The DNA in this composition, in one embodiment, is capable of being introduced into a cell in which it can exist as an extrachromosomal element or as an integration element into cellular DNA.

[0160] The invention is also directed to DNA segments encoding the mutant proteins disclosed herein, vectors containing these segments and host cells containing the vectors. The vectors containing the DNA segments may be used to propagate (i.e. amplify) the segment in an appropriate host cell and/or to allow expression from the segment (i.e. an expression vector). The person of ordinary skill in the art would be aware of the various vectors available for propagation and expression of a cloned DNA sequence. In a preferred embodiment, a DNA segment encoding mutant TnsC protein is contained in a plasmid vector that allows expression of the protein and subsequent isolation and purification of the protein produced by the recombinant vector. Accordingly, the proteins disclosed herein can be purified following expression from the native transposon, obtained by chemical synthesis, or obtained by recombinant methods.

[0161] Relevant compositions, accordingly, include expression vectors for the mutant protein alone or in combination with expression vectors for the other proteins necessary for insertion of a transposable element. Such compositions may further comprise the transposable element to be acted upon by the proteins. Such mixtures are useful for achieving in vivo insertion, among other things.

[0162] The invention further encompasses kits containing the above-described compositions.

[0163] Tn7 can be obtained as strain ATCC 29181; a K-12 derivative carrying the resistance transfer factor R483; originally identified as carrying a transposon in Barth et al. J. Bacteriol. 125:800-810 (1976). The sequence of Tn7 is Genbank entry ISTN7TNS, Accession no. X17693; reported in Flores et al. Nucleic Acids Res. 18:901-11 (1990).

[0164] Having now generally described this invention, the same will be further described by reference to certain specific examples which are provided herein for purposes of illustration only and are not intended to be limiting unless otherwise specified.

Experimental

EXAMPLE 1

[0165] Materials and Methods

[0166] Media, Chemicals, and Enzymes:

[0167] LB broth and agar were prepared as described (42). Trimethoprim selection was on Isosensitest agar (Oxoid). Lac phenotypes were evaluated on MacConkey lactose agar (Difco). Antibiotic concentrations used were 100 μg/ml carbenicillin (Cb), 30 μg/ml chloramphenicol (Cm), 7.5 μg/ml gentamycin (Gn), 50 μg/ml kanamycin (Km), 10 μg/ml nalidixic acid (Nal), 20 μg/ml tetracycline (Tet) and 100 μg/ml trimethoprim (Tp). Hydroxylamine was purchased from Sigma. DNA modifying enzymes were purchased from commercial sources and used as recommended by the manufacturer.

[0168] Bacterial Strains, Phages and Plasmids:

[0169] BR293 is E. coli F Δ(lac-pro) thi rpsL Δ(gal-λG)+lacZ pL cI+434 pRS₇ (43) (44). BR293 is identical to NK8027 (45), and was provided by Nancy Kleckner. NLC51 is E. coli F⁻ araD139 Δ(argF-lac) U169 rpsL150 relA1 flbB5301 deoC1 ptsF25 rbsR Val^(R) recA56 (46). CW51 is E. coli F⁻ ara arg Δlac-proXIII recA56 Nal^(R) Rif^(R) (11). λKK1 is lambda 780 hisG9424::Tn10 del16 del17::attTn7::miniTn7-Km^(R) (47). Tns transposition proteins were provided by pCW15 (tnsABC), pCW23 (tnsD), pCW30 (tnsE), or pCW4 (tnsABCDE) (11). Target plasmids were derivatives of pOX-G, a conjugable derivative of the F plasmid that carries Gn^(R) (48). pOX-attTn7 carries a (−342 to +165) attTn7 sequence (16). The immune plasmid pOX-attTn7 EP-1::miniTn7-Cm^(R) was made by transposing miniTn7-Cm^(R) (47) onto pOX-attTn7 using TnsABC+E to direct the insertion into a non-attTn7 position. Construction of the immune target plasmid pOX-G::miniTn7-dhfr is described below. The transposon donor plasmid for the papillation assay was pOX-G::miniTn7lac, containing promoterless lacZY between the transposon ends (50). The high copy transposon donor for mating-out assays was pEMΔ, containing miniTn7-Km^(R) (23).

[0170] Manipulation and Characterization of DNA:

[0171] Phage and plasmid isolation, transformation, and standard cloning techniques were performed as described in (40). Conjugation and P1 transduction were performed as described in (42). DNA sequencing was done on an automated ABI sequencer. Two plasmids were constructed in this work: (1) pOX-G::miniTn7-dhfr. MiniTn7-dhfr was constructed by replacing the Km^(R) cassette in pLA1 (16) with a dhfr cassette from pSD511 (28), which had been amplified by PCR to add flanking SalI sites. The PCR fragment was ligated into the TA vector (Invitrogen); the dhfr cassette was then removed by SalI digestion and inserted into the SalI site of pLA1, replacing the Km^(R) gene. The resulting plasmid was transformed into NLC51+pCW4+pOX-G, and grown for several days to allow transposition to occur. pOX-G plasmids which had received a miniTn7-dhfr insertion were identified by mating into CW51 and selecting for Tp^(R).

[0172] Mutagenesis of tnsC:

[0173] The TnsABC plasmid pCW15 was exposed to 1 M hydroxylamine hydrochloride in 0.45 M NaOH (final pH approximately 7.0) at 37° C. for 20 hours (ROSE et al. 1990). The DNA was recovered by multiple ethanol precipitations, and PvuII-SphI fragments containing mutagenized tnsC were subcloned into untreated pCW15, replacing the wild-type TnsC (SEQ ID NO.1). These plasmids were then introduced into CW51+pOX-G::miniTn7lac by electroporation, and transformants were selected on MacConkey lactose plates containing Cm. The plates were incubated at 30° C. for 3-4 days, and screened for the emergence of Lac⁺ papillae, indicating transposition of miniTn7lac.

[0174] λ Hop Transposition Assay:

[0175] Tn7 transposition was evaluated in NLC51 strains into which tns functions were introduced by transformation, and pOX-G was introduced by conjugation (for FIG. 5). The protocol of (47) was followed: Cells were grown in LB and 0.2% maltose at 37° C. to an OD₆₀₀ of 0.4-0.6 and then concentrated to 1.6×10⁹ cells/ml by centrifugation and resuspension in 10 mM MgSO₄. 0.1 ml cells were combined with 0.1 ml λKK1 containing miniTn7-Km^(R) at a multiplicity of infection of 0.1 phage per cell. The infection proceeded for 15 min at 37° C., and was terminated by the addition of 10 mM sodium citrate in 0.8 ml LB. Cells were allowed to recover with aeration for 60 minutes at 37° C., and then spread on plates containing Km and citrate. Transposition frequency is expressed as the number of Km^(R) colonies/pfu of λKK1.
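The arithmetic behind the λ hop frequency can be sketched as follows. The cell concentration, cell volume and multiplicity of infection are taken from the protocol above; the colony count is a hypothetical number chosen only for illustration, and the function names are assumptions. The sketch estimates input pfu from the stated multiplicity of infection, whereas in practice the phage stock would be titered directly.

```python
# Values stated in the protocol above.
CELLS_PER_ML = 1.6e9      # concentrated cell suspension
CELL_VOLUME_ML = 0.1      # volume of cells used per infection
MOI = 0.1                 # phage per cell


def input_pfu(cells_per_ml: float, cell_volume_ml: float, moi: float) -> float:
    """Estimate the phage input (pfu) implied by the stated cell number and MOI."""
    return cells_per_ml * cell_volume_ml * moi


def lambda_hop_frequency(kmr_colonies: float, pfu: float) -> float:
    """Transposition frequency expressed as Km^R colonies per pfu of phage."""
    if pfu <= 0:
        raise ValueError("pfu must be positive")
    return kmr_colonies / pfu


if __name__ == "__main__":
    pfu = input_pfu(CELLS_PER_ML, CELL_VOLUME_ML, MOI)
    hypothetical_colonies = 35  # illustrative count, not a reported result
    freq = lambda_hop_frequency(hypothetical_colonies, pfu)
    print(f"input pfu ~ {pfu:.2e}; frequency ~ {freq:.2e} Km^R colonies/pfu")
```

With about 1.6×10⁷ input pfu per infection, a single colony corresponds to a frequency of roughly 6×10⁻⁸, which gives a sense of the detection limit of one such infection if the whole mixture is plated.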

[0176] Mating-out Transposition Assay:

[0177] Tn7 transposition was evaluated in the derivatives of BR293 used to monitor SOS induction (Table 3), or in NLC51 strains into which tns functions were introduced by transformation, and pOX-G or pOX-G::miniTn7-dhfr were introduced by conjugation. MiniTn7-Km^(R) was present in the NLC51 strains either in the chromosomal attTn7 site (FIG. 4 and Table 1) or the high copy plasmid pEMΔ (Table 2) (SEQ ID NO:4). The protocol was adapted from (11): The donor strains described above and the recipient strain CW51 were grown at 37° C. to an OD₆₀₀ of 0.4-0.6 with gentle aeration. Donors and recipients were mixed at a ratio of 1:5, and growth was continued for another hour. Mating was disrupted by vigorous vortexing, and the cells were diluted and plated. The total number of exconjugants was determined by selection on GnNal plates. Tn7-containing exconjugants were selected on TpNal plates, and miniTn7-Km^(R) exconjugants were selected on KmNal plates. Transposition frequencies are expressed as the number of Tp^(R)- or Km^(R)-exconjugants/total number of exconjugants.
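The mating-out frequency defined above is a ratio of two plate titers, which can be sketched as follows. All plate counts, dilutions and plated volumes in the example are hypothetical and serve only to show the calculation; the function names are assumptions and are not part of the described protocol.

```python
def titer_per_ml(colonies: float, dilution: float, volume_plated_ml: float) -> float:
    """Colony-forming units per ml of the undiluted mating mixture.
    `dilution` is the fraction of the original culture in the plated sample
    (for example 1e-4 for a 10,000-fold dilution, 1.0 for undiluted)."""
    if dilution <= 0 or volume_plated_ml <= 0:
        raise ValueError("dilution and plated volume must be positive")
    return colonies / (dilution * volume_plated_ml)


def mating_out_frequency(selected_titer: float, total_titer: float) -> float:
    """Tp^R or Km^R exconjugants divided by total (GnNal) exconjugants."""
    if total_titer <= 0:
        raise ValueError("total exconjugant titer must be positive")
    return selected_titer / total_titer


if __name__ == "__main__":
    # Hypothetical counts: 200 colonies on GnNal from a 1e-4 dilution,
    # 50 colonies on KmNal from the undiluted mixture, 0.1 ml plated each.
    total = titer_per_ml(200, 1e-4, 0.1)
    km_resistant = titer_per_ml(50, 1.0, 0.1)
    print(f"frequency ~ {mating_out_frequency(km_resistant, total):.2e}")
```

Expressing both counts per ml of the same undiluted mixture before dividing keeps the ratio independent of the different dilutions used on the selective and total plates.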

[0178] Results

[0179] Isolation of the TnsC gain-of-function Mutants:

[0180] To focus on the relationship of TnsC and the target DNA, the inventor isolated gain-of-function TnsC mutants that activated the TnsA+B transposase in the absence of TnsD or TnsE. Since overexpression of wild-type TnsC does not relieve the requirement for TnsD or TnsE (11), these gain-of-function mutations were predicted to affect the biochemical properties of TnsC, rather than its expression or stability.

[0181] A visual assay for Tn7 transposition (50) (51) was used to identify mutants. This assay uses a miniTn7lac element which carries promoterless lacZY genes between the cis-acting sequences at the transposon ends. The miniTn7lac element is located in a transcriptionally silent position on a donor plasmid; cells containing this plasmid are phenotypically Lac⁻. When Tns functions are provided in trans, miniTn7lac can transpose to new sites in the E. coli chromosome. Some of those transposition events place the element downstream from active promoters, resulting in increased lacZ expression. This is observed on MacConkey lactose color indicator plates as the emergence of red (Lac⁺) papillae in an otherwise white (Lac⁻) colony. Therefore, the number of papillae reflects the amount of transposition which occurred during the growth of that colony.

[0182] Cells containing miniTn7lac and various Tns functions were patched on color indicator plates (FIG. 1). Virtually no Lac⁺ papillae were seen in cells containing only TnsABC^(wt). Cells containing TnsABC^(wt)+E produced many Lac⁺ papillae. Southern blotting demonstrated that TnsABC^(wt)+E papillae result from translocations of miniTn7lac to a variety of chromosomal locations rather than from intramolecular rearrangements of the donor plasmid (50). Most TnsABC^(wt)+D events are silent because there is no appropriately oriented promoter adjacent to attTn7 (50) (52).

[0183] This visual assay was used to screen for TnsC mutants that had acquired the ability to activate Tn7 transposition in the absence of TnsD or TnsE. Randomly mutagenized tnsC was cloned into a plasmid containing tnsAB. These tns genes were introduced into cells containing miniTn7lac. Six gain-of-function TnsC mutants were identified (FIG. 1).

[0184] Transposition activated by these TnsC mutants still required the TnsA+B transposase and intact transposon ends. The papillation phenotypes of the TnsC mutants varied considerably, suggesting that different mutants were activating different amounts of miniTn7lac transposition. Several TnsC mutants promoted more transposition than TnsABC^(wt)+E. TnsABC^(S401YΔ402) achieved the highest level of transposition.

[0185] The amino acid changes responsible for the mutant phenotypes were determined by DNA subcloning and sequencing. tnsC encodes a protein of 555 amino acids, with Walker A and B motifs in the amino-terminal half of the protein (53). Walker A and B motifs have been implicated by structural and mutational analyses to be directly involved in nucleotide binding and/or hydrolysis in a variety of ATPases and GTPases (37) (55).

[0186] The tnsC mutations primarily result in single amino acid substitutions whose locations are scattered across the TnsC protein sequence (FIG. 2). TnsC mutants segregate into two phenotypic classes. Transposition reactions activated by Class I mutants are sensitive to immune targets and the target selection factors TnsD and TnsE. Transposition reactions activated by the Class II mutants are impaired in their responses to these signals. The residues affected in two of the mutants (TnsC^(A225V) and TnsC^(E233K)) lie in or very close to the Walker B motif.

[0187] TnsC Mutants Promote Intermolecular Transposition:

[0188] The papillation assay is a powerful screen for transposition activity, but it does not necessarily report intermolecular transposition events. Internal rearrangements of the miniTn7lac donor plasmid, which fortuitously place the miniTn7lac element downstream from a promoter, would also produce Lac⁺ papillae. Therefore, the inventor investigated whether the TnsC mutants enable the TnsA+B transposase to carry out intramolecular recombination, or whether the mutants promote intermolecular transposition.

[0189] The λ hop assay measures the translocation of a miniTn7-Km^(R) element from a replication- and integration-defective λ phage to the bacterial chromosome during a transient infection. The miniTn7-Km^(R) element carries a kanamycin resistance cassette with a constitutive promoter. Therefore, the λ hop assay reports the total number of transposition events occurring into the chromosome. TnsABC^(wt) had no detectable transposition activity in the λ hop assay. TnsABC^(wt)+E generated 2.2×10⁻⁷ Km^(R) colonies/pfu (FIG. 3). Transposition promoted by TnsABC^(wt)+D generated 1.8×10⁻⁴ Km^(R) colonies/pfu. All of the TnsC mutants could promote the translocation of miniTn7-Km^(R). TnsABC^(A225V) and TnsABC^(S401YΔ402) promoted 8- and 50-fold more transposition, respectively, than TnsABC^(wt)+E. Other TnsC mutants promoted transposition, although not at such levels.
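The fold comparisons in this paragraph can be put on a common scale with a small worked calculation. The two wild-type frequencies and the 8- and 50-fold factors come from the paragraph above; the absolute mutant frequencies computed below are only implied by those rounded fold values and are not reported measurements. The variable names are assumptions, and the second mutant is labelled S401YΔ402 as it is named elsewhere in this document.

```python
# Frequencies stated in the text (Km^R colonies per pfu).
WT_PLUS_E = 2.2e-7
WT_PLUS_D = 1.8e-4

# Fold increases over TnsABC(wt)+E stated in the text.
FOLD_OVER_E = {
    "TnsABC(A225V)": 8,
    "TnsABC(S401YΔ402)": 50,
}

# Approximate absolute frequencies implied by the stated fold values.
implied_frequency = {name: fold * WT_PLUS_E for name, fold in FOLD_OVER_E.items()}

# How much more efficient the TnsD pathway is than the TnsE pathway.
d_over_e = WT_PLUS_D / WT_PLUS_E

if __name__ == "__main__":
    for name, freq in implied_frequency.items():
        print(f"{name}: ~{freq:.1e} Km^R colonies/pfu (implied)")
    print(f"TnsABC(wt)+D is ~{d_over_e:.0f}-fold above TnsABC(wt)+E")
```

On this scale both mutants exceed the TnsE-dependent wild-type reaction but remain below the TnsD-dependent reaction, which is roughly 800-fold above TnsABC^(wt)+E.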

[0190] The mating-out assay was used to explore the ability of the TnsC mutants to promote translocations into a different type of target molecule. This assay measures the frequency of transposition of miniTn7-Km^(R) from the chromosome to pOX-G, a conjugable derivative of the E. coli F factor. The TnsABC^(wt)+E machinery preferentially selects conjugable plasmids as targets for transposition, whereas the TnsABC^(wt)+D machinery does not recognize pOX-G unless it contains attTn7 sequences (ROGERS et al. 1986, WADDELL and CRAIG 1988, WOLKOW et al. 1996). The TnsC mutants could promote transposition to pOX-G (FIG. 4). Thus, the results demonstrate that the gain-of-function TnsC mutants can promote intermolecular transposition.

[0191] Effects of the Target Selection Factors TnsD and TnsE:

[0192] Frequencies of transposition of miniTn7-Km^(R) from a lambda phage to the chromosome and/or pOX-G were measured in strains containing TnsA+B and the TnsC mutants, either alone or in combination with TnsD or TnsE. The preferred target for TnsE reactions, pOX-G, was introduced by conjugation into strains containing TnsC mutants or the TnsC mutants+TnsE. The distribution of miniTn7-Km^(R) insertions between the chromosome and pOX-G was determined by mating the pOX-G plasmids from the Km^(R) products of a lambda hop assay into the Km^(s) strain CW51, and testing whether Km resistance was plasmid linked. No transposition was detected in strains containing TnsABC^(wt) alone or TnsABC^(E233K)+TnsD.

[0193] Response to the Target Selectors TnsD and TnsE:

[0194] TnsD and TnsE are required to activate the TnsABC^(wt) machinery and to direct transposition into particular target DNAs (10) (11) (13) (14). The TnsABC^(mutant) machineries, by definition, do not require the inputs of TnsD or TnsE. However, the inventor investigated whether TnsD or TnsE could influence the frequencies or distribution of transposition events promoted by the TnsC mutants.

[0195] The λ hop assay was used to evaluate the effects of TnsD and an available attTn7 site on transposition promoted by the TnsC mutants. All of the mutant reactions were responsive to TnsD+attTn7, but those responses varied widely. Reactions activated by TnsABC^(A225V) and TnsABC^(E273K) were strongly stimulated by TnsD+attTn7, promoting 500- and 5000-fold more transposition, respectively, in the presence of TnsD+attTn7 than with TnsABC^(A225V) or TnsABC^(E273K) alone. The remaining mutant reactions were less profoundly influenced by TnsD: TnsABC^(S401F) reactions showed a moderate (50-fold) stimulation. Reactions activated by TnsC^(E233K), TnsC^(S401YΔ402) and TnsC^(A282T) were somewhat inhibited in the presence of TnsD.

[0196] The effects of TnsE were also studied by the λ hop assay. In the absence of TnsE, the vast majority of the TnsABC^(mutant) transposition events were targeted to the chromosome. In the presence of TnsE, preferential insertion into pOX-G was observed with some of the TnsC mutants.

[0197] These differential responses suggest that the six TnsC mutants are not activating Tn7 transposition through a single mechanism. Instead, the mutants can be segregated into two classes, based on their ability to respond to TnsD and TnsE. Transposition activated by the Class I mutants (TnsC^(A225V) and TnsC^(E273K)) can be stimulated by TnsD and targeted to pOX-G by TnsE. Transposition activated by the Class II mutants (TnsC^(E233K), TnsC^(A282T), TnsC^(S401YΔ402) and TnsC^(S401F)) is not responsive to the positive effects of TnsD or TnsE or both. By these criteria, TnsC^(S401F) is proposed to be a member of Class II: although TnsC^(S401F)-activated reactions are somewhat stimulated by TnsD, the distribution of insertions in TnsC^(S401F)-activated reactions is not affected by TnsE. The grouping of the TnsC mutants into these two classes is supported by the differential responses of the TnsABC^(mutant) reactions to immune targets, as described below.

[0198] Discussion

[0199] Proteins Involved in Target Evaluation:

[0200] How is an appropriate target for Tn7 transposition identified? The inventor has hypothesized that TnsC may serve as a “connector” or “matchmaker”, linking the transposase and the target DNA in a manner regulated by the ATP state of TnsC (23) (27). TnsC has the biochemical properties necessary for that connection: it can directly interact with target DNA (24) and with the TnsA+B transposase (A. STELLWAGEN and N. L. CRAIG, unpublished results). However, wild-type TnsC (SEQ ID NO:1) is not sufficient to activate transposition. Instead, Tn7 transposition is dependent on TnsD or TnsE to activate the TnsABC^(wt) machinery and select a target site. TnsD is an attTn7 binding protein (23) which recruits TnsC to this target. The resulting TnsC-TnsD-attTn7 complex can then attract the transposase in vitro (23). The mechanism by which TnsE activates transposition is not yet known. TnsE might be preferentially localized to conjugating plasmids and subsequently recruit TnsC to those molecules, or TnsE might modify TnsC so that TnsC's binding activity is now directed to those targets. Alternatively, TnsE might modify the transposase directly, without proceeding through TnsC. The results suggest that TnsD and TnsE provide alternative inputs into TnsC, which in turn recruits the TnsA+B transposase to the target DNA.

[0201] The successful isolation of TnsC gain-of-function mutants reveals that the TnsABC machinery is capable of engaging target DNA and promoting insertions without TnsD or TnsE. However, the mutant reactions have not mimicked the abilities of TnsD or TnsE to direct transposition into particular targets: transposition activated by the TnsC mutants does not show the preferential insertion into conjugable plasmids seen with TnsE-activated reactions, nor the attTn7 specificity of TnsD-activated reactions. Therefore, TnsD and TnsE are essential to recognize these positive target signals.

[0202] TnsC appears to receive a variety of inputs (from TnsD, from TnsE and from immune targets) which control its activity. The activity of TnsC can also be influenced by mutation. Six gain-of-function TnsC point mutants have been described in this work, which segregate into two classes. The fact that different classes of TnsC mutants with different transposition activities were recovered is consistent with the hypothesis that there are multiple routes to activating TnsC. The Class I mutants, TnsC^(A225V) and TnsC^(E273K), enable the TnsABC machinery to execute transposition without sacrificing its ability to respond to both positive and negative target signals. Both are substantial gain-of-function mutants, with TnsABC^(A225V) promoting eight-fold more transposition to the chromosomes than TnsABC^(wt)+E (FIG. 3). Transposition activated by these Class I mutants can be profoundly stimulated by TnsD+attTn7, or directed to conjugable plasmids by TnsE, as well as being able to discriminate between immune and non-immune targets. Thus, the gain-of-function phenotypes seen with the Class I mutants have been achieved while preserving the ability of these TnsCs to transduce information between the target DNA and the transposase.

[0203] The TnsC mutants which fall into the second class behave much more like constitutively activated versions of TnsC. Some of these mutants also promote considerable amounts of transposition: TnsABC^(S401YΔ402) results in 50-fold more transposition to the chromosomes than TnsABC^(wt)+E (FIG. 3). However, the nature of the transposition reactions promoted by the Class II TnsC mutants is quite different from that seen with the Class I mutants. Immune and non-immune targets are used essentially equivalently in reactions with the Class II mutants, and TnsD and TnsE are not able to profoundly influence the frequency or distribution of these transposition events. A similar loss of responsiveness to target signals is seen when Tn7 transposition is activated by nonhydrolyzable ATP analogs in vitro. Transposition can still occur when TnsC's ATPase activity is blocked with AMP-PNP, but those transposition events no longer require TnsD and are no longer targeted to attTn7 (BAINTON et al. 1993). Instead, any DNA molecule, including immune targets, can serve as a target for Tn7 insertion. Thus, TnsABC transposition can be constitutively activated by AMP-PNP or by the Class II TnsC mutants. It is noteworthy that the amino acid affected in TnsC^(E233K) lies in one of TnsC's ATP motifs.

[0204] Comparison to Other Elements:

[0205] The use of an ATP-dependent protein such as TnsC to regulate target site selection is not unique to Tn7. Bacteriophage Mu transposition is also profoundly influenced by its ATP-utilizing protein MuB. MuB is an ATP-dependent DNA binding protein (57) (MAXWELL et al. 1987) which is required for efficient transposition in vivo (58) (59). In vitro, the MuA transposase preferentially directs insertions into targets that are bound by MuB (60) (61) (19). Although there is no particular sequence specificity to MuB binding, its distribution on DNA is not random: MuB binding to target molecules that already contain Mu sequences is specifically destabilized through an ATP-dependent mechanism (19). Therefore Mu, like Tn7, recognizes and avoids immune targets; moreover, MuB and TnsC^(A225V) appear to play functionally similar roles in regulating transposition.

[0206] Mu and Tn7 belong to a family of transposons which encode proteins with ATP binding/hydrolysis motifs; other members include IS21 (35) (62), Tn552 (36), Tn5053 (33), and Tn5090 (34). Therefore, the strategy of using an ATP binding protein to regulate target site selection may extend to the entire family. Tn5053 is particularly interesting, since it encodes three proteins which are required for its transposition: a presumptive transposase containing a D, D(35)E motif characteristic of transposases and integrases, a potential regulatory protein containing Walker A and B motifs, and a third protein of unknown function (33). Tn5053 shows some degree of target site specificity, inserting predominantly into the par locus of the conjugable plasmid RP4. It is tantalizing to speculate that the third protein of Tn5053 is a target selector, like TnsD or TnsE, directing insertions into the par locus.

[0207] The inventor's work has illustrated the role of target DNA in controlling Tn7 transposition in vivo, and has strongly implicated TnsC as a central player in this regulation. Single amino acid changes in TnsC can disrupt the communication between the transposon and the target site, reducing the stringency of Tn7's target site selectivity. TnsD promotes Tn7 insertion at high frequency into attTn7, a safe haven in the bacterial chromosome, whereas TnsE allows Tn7 access to conjugable plasmids, and thus a means to spread through bacterial populations. Avoidance of immune targets also promotes the spread of the element, rather than local hopping, and prevents one Tn7 element from inserting into another. TnsC may integrate all of these target signals, and communicate that information to the transposase.

TABLE 1
TnsC^(A225V) promotes intermolecular transposition

Tns functions        Transposition frequency
TnsABC^(wt)          <10⁻⁷
TnsABC^(A225V)       8.8 (±8.1) × 10⁻⁶
TnsABC^(wt)DE        5.5 (±1.1) × 10⁻⁴

[0208] Frequencies of transposition of miniTn7-Km^(R) from a high copy plasmid to pOX-G were determined using the mating-out assay, and are expressed as the number of Km^(R) exconjugants/total exconjugants. Each value is the average of three independent measurements.

EXAMPLE 2

[0209] Materials and Methods

[0210] Media, Chemicals, and Enzymes

[0211] Luria broth (LB) and agar were prepared as described by (42). Carbenicillin and kanamycin selections were carried out at a concentration of 100 μg/ml. DNA modifying and restriction enzymes were purchased from commercial sources and used according to manufacturer's instructions. Taq polymerase was purchased from Boehringer Mannheim Biochemicals.

[0212] Bacterial Strains and Plasmids

[0213] Tn7 donor plasmids contain a miniTn7 element in which the minimal end sequences of Tn7 (Tn7L 1-166 and Tn7R 1-199) flank a selectable marker. A pBR plasmid containing a mTn7-kanamycin element with NotI and SpeI sites at the ends of the kanamycin cassette has been shown to be an effective donor. When transposition products are to be recovered by transformation, it is useful to prevent transformation of unreacted donor. One strategy is to cut the donor backbone with a restriction enzyme that does not cut within the Tn7 element or within the target DNA. Another strategy is to use donor plasmids that will not replicate in the strain in which the products are recovered, for instance by making the replication of the donor depend on a protein that is not present in the transformation strain. For example, the mTn7 element can be placed on a plasmid which does not itself encode an initiator protein for replication. A particular example is to make the donor backbone an R6K plasmid that does not encode the replicator protein pir. The R6Kpir-miniTn7 plasmid can then be grown in a strain which contains pir (supplied for example by a heterologous plasmid) and the transposition mixture transformed into a strain lacking pir. With selection for the marker on the mTn7, only insertions into the target DNA will be recovered. Subcloning Efficiency DH5alpha competent cells were purchased from GIBCO BRL and used according to the manufacturer's instructions.

[0214] The target plasmid pRM2 (SEQ ID NO:6) contains bases -342 to +165 of attTn7 cloned into pUC18 (47). The donor plasmid pEMΔ (SEQ ID NO:4) carries a miniTn7 element composed of the 166 terminal bases of the left end of Tn7 and 199 bases of the right end flanking a gene conferring resistance to kanamycin (23).

[0215] Tns Proteins

[0216] The purification of TnsA and TnsB-His is described in (63). TnsA was stored in 25 mM Hepes (pH 8.0), 150 mM NaCl, 1 mM EDTA, 1 mM DTT, 5% glycerol at −80° C. TnsB was TnsB-His, a derivative containing a C-terminal polyhistidine tag, and was stored in 25 mM Hepes (pH 8.0), 500 mM KCl, 2 mM DTT, 1 mg/ml BSA, 25% glycerol at −80° C. The purification of TnsC and TnsC^(A225V) is a modified procedure from (24) which is described in (25) (26). Both proteins were stored in 25 mM Hepes (pH 8.0), 1 M NaCl, 2.5 mM DTT, 1 mM ATP, 10 mM MgCl₂, 0.1 mM EDTA, 10 mM CHAPS, 10% glycerol at −80° C. TnsD was TnsD-His (P. Sharpe and N. Craig, in preparation), a derivative containing a C-terminal polyhistidine tag, and was purified by Ni²⁺ chromatography before being stored in 50 mM Tris (pH 7.5), 2 mM DTT, 500 mM KCl, 1 mM EDTA and 25% glycerol at −80° C.

[0217] Transposition Reactions in vitro

[0218] Transposition reactions are adapted from the standard in vitro reaction described in (23). Reaction mixtures, 100 μl in volume, contained (final concentration) 0.25 nM pEMΔ donor, 1.9 nM pRM2 target, 26 mM Hepes, 4.2 mM Tris (pH 7.6), 50 μg/ml BSA, 100 μg/ml yeast tRNA, 2 mM ATP (pH 7.0), 2.1 mM DTT, 0.05 mM EDTA, 0.2 mM MgCl₂, 0.2 mM CHAPS, 28 mM NaCl, 21 mM KCl, 1.35% glycerol, 60 ng TnsA, 25 ng TnsB, either 100 ng TnsC^(wt) or 100 ng TnsC^(A225V), and 40 ng TnsD, unless otherwise indicated, in a 30 minute preincubation at 30° C. (TnsA=19 nM, TnsB=3.1 nM, TnsC=16 nM, TnsD=6.5 nM). Magnesium acetate was added to a final concentration of 15 mM and the reactions were allowed to proceed for an additional 60 minutes at 30° C. Products were extracted with a 1:1 mixture of phenol/chloroform, ethanol-precipitated, and resuspended in water in preparation for subsequent analyses.
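The protein amounts above are given both as masses (ng) and as molar concentrations (nM) in the 100 μl reaction. As a consistency check only, the short Python sketch below back-calculates the molar mass that each pair of figures implies. The function name and the dictionary are illustrative and are not part of the described method; the resulting values are derived purely from the numbers quoted in this paragraph.

```python
# Back-calculate the molar mass implied by each protein's stated mass (ng)
# and stated molar concentration (nM) in the 100 ul preincubation mixture.

REACTION_VOLUME_UL = 100.0  # reaction volume stated in the text

# protein -> (mass in ng, concentration in nM), as quoted in the text
STATED = {
    "TnsA": (60.0, 19.0),
    "TnsB": (25.0, 3.1),
    "TnsC": (100.0, 16.0),
    "TnsD": (40.0, 6.5),
}


def implied_molar_mass_kda(mass_ng, conc_nm, volume_ul=REACTION_VOLUME_UL):
    """Return the molar mass (kDa) at which mass and concentration agree."""
    grams = mass_ng * 1e-9
    moles = (conc_nm * 1e-9) * (volume_ul * 1e-6)  # mol/L times L
    return grams / moles / 1000.0


if __name__ == "__main__":
    for name, (mass_ng, conc_nm) in STATED.items():
        kda = implied_molar_mass_kda(mass_ng, conc_nm)
        print(f"{name}: about {kda:.1f} kDa implied")
```

The same function can be run in reverse when scaling the reaction: fixing the molar concentration and volume gives the mass of protein to add.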

[0219] PCR Primers and Amplification

[0220] Oligonucleotides used for the various PCR amplifications to analyze the products of transposition are:

[0221] NLC95 (SEQ ID NO.7): (5′)-ATAATCCTTAAAAACTCCATTTCCACCCCT-(3′)

[0222] NLC209 (SEQ ID NO.8): (5′)-GTGATTGCACCGATCTTCTACACCGTTCC-(3′)

[0223] NLC429 (SEQ ID NO.9): (5′)-TTTCACCGTCATCACCGAAACGCGCGAGAC-(3′)

[0224] NLC430 (SEQ ID NO.10): (5′)-AATGACTTGGTTGAGTACTCACCAGTCACA-(3′)

[0225] NLC431 (SEQ ID NO.11): (5′)-ATGAACGAAATAGACAGATCGCTGAGATAG-(3′)

[0226] NLC432 (SEQ ID NO.12): (5′)-CAAGACGATAGTTACCGGATAAGGCGCAGC-(3′)
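For readers who want to handle the primers listed above programmatically, the following Python sketch stores the six sequences exactly as printed and reports their length, GC fraction, and reverse complement. The helper names are illustrative assumptions, and the sketch makes no claim about melting temperatures or about where each primer anneals.

```python
# Basic properties of the PCR primers listed above (sequences as printed, 5' to 3').

PRIMERS = {
    "NLC95": "ATAATCCTTAAAAACTCCATTTCCACCCCT",
    "NLC209": "GTGATTGCACCGATCTTCTACACCGTTCC",
    "NLC429": "TTTCACCGTCATCACCGAAACGCGCGAGAC",
    "NLC430": "AATGACTTGGTTGAGTACTCACCAGTCACA",
    "NLC431": "ATGAACGAAATAGACAGATCGCTGAGATAG",
    "NLC432": "CAAGACGATAGTTACCGGATAAGGCGCAGC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Return the fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), f"GC={gc_fraction(seq):.2f}", reverse_complement(seq))
```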

[0227] Two percent of the 100 μl transposition reaction was used as the template in a given PCR amplification. 100 pg of plasmid pMCB20 was used when amplifying a marker product for size comparison on the high resolution denaturing gels. 30 temperature cycles of 94° C. for 1.0 minute, 55° C. for 1.5 minutes, and 72° C. for 1.5 minutes were run for all amplifications, followed by a single 5 minute incubation at 72° C. The buffer composition and quantity of Taq polymerase recommended by the manufacturer (Boehringer Mannheim Biochemicals) were used for all reactions. PCR products were ethanol-precipitated, resuspended in water, and loaded on a high resolution denaturing gel.

[0228] Probe Labeling

[0229] Oligonucleotide probes were 5′ end-labeled with [gamma-³²P] ATP substrate and bacteriophage T4 polynucleotide kinase for 45 minutes at 37° C. Labeled probes were separated from unincorporated label by size exclusion through a G50 Nick Spin Column (Pharmacia).

[0230] High Resolution Denaturing Gels

[0231] The resuspended PCR products were electrophoresed on either a 5% or 6% polyacrylamide denaturing gel and electrotransferred to GeneScreen Plus membrane (du Pont). The resulting blots were visualized by hybridization with an appropriate oligonucleotide probe at 50° C. and exposed overnight to phosphorimager screens (Molecular Dynamics), which were scanned the following day.

[0232] Results

[0233] TnsC^(A225V) Supports Efficient Transposition in vitro

[0234] A diagram of Tn7 transposition is shown in FIG. 5. Tn7 mobilizes via a cut-and-paste mechanism, whereby both ends of the element are first excised from the donor backbone by double-strand breaks and then join to the target DNA, most likely via transesterification reactions, to form simple insertions with short gaps at either end. Other possible intermediates of a transposition reaction are double-strand breaks (DSBs), where one end of the transposon has been excised but the other end remains attached to the donor backbone; excised linear transposons (ELTs), where both ends have been excised from the donor and neither end has joined to the target; and double-strand break, single-end joins (DSB-SEJs), where one transposon end has been broken in the donor and joined to the target molecule.

[0235] The Tn7 transposition reaction has been reconstituted in vitro: purified Tns proteins promote the transposition of a mini Tn7 element from a donor plasmid into an attTn7-containing target plasmid (Bainton 1993). TnsABC^(wt)+D supports this reaction with great efficiency. In the absence of TnsD, TnsABC^(wt) does not generate a detectable level of insertion products (FIG. 6, lane 2), although double-strand break intermediates are seen upon prolonged incubation. By contrast, reactions containing TnsABC^(A225V) show a dramatic accumulation of simple insertions, at efficiencies that approach those of TnsABC^(wt)+D reactions (FIG. 6, lane 5). Neither the TnsABC^(A225V) nor the TnsABC^(wt)+D reactions generate visible levels of DSB-SEJ products, indicating that the vast majority of Tn7 transposition events result in the complete (i.e., two-ended) insertion of the transposon into the target DNA, rather than a single-ended insertion event.

[0236] TnsABC+D transposition is not only efficient, it is also very target site-specific. TnsABC+D insertions occur almost exclusively into the attTn7 site present on the target plasmid (Bainton, et al., 1993, data not shown). By contrast, the TnsABC^(A225V) insertions are not limited to the attTn7 site. Alternative restriction analysis of the TnsABC^(A225V) reaction yields a smear of products on an agarose gel (data not shown), suggestive of a population of insertions located at many different positions in the target plasmid. To investigate the distribution of these insertions, we subjected the TnsABC^(A225V) reaction products to high-resolution analysis, as described below.

[0237] Distribution of TnsABC^(A225V)-Mediated Insertions is Highly Nonspecific

[0238] A PCR-based approach has been used to analyze insertional mutations in SV40 and yeast TRP1ARS1 minichromosomes [30, 31], and to perform functional analyses of insertional mutations in yeast chromosome V and the E. coli supF gene ([Smith, 1996] and [32], respectively).

[0239] PCR was utilized to survey, at higher resolution, the distribution of TnsABC^(A225V) insertions previously seen on the agarose gel. The diagram in FIG. 7 illustrates the PCR strategy used to amplify the population of insertion products present in a TnsABC^(A225V) reaction, with two representative insertions being shown as examples. One PCR primer (NLC95) (SEQ ID NO:7) hybridizes within the cis-acting end sequence of the inserted element and the other (NLC209) (SEQ ID NO:8) hybridizes to an arbitrary position in the target molecule. Thus, the length of the PCR product reflects the positions of the insertions into the target molecule.
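The relationship between product length and insertion position can be illustrated with a simplified Python sketch. It assumes a linear coordinate system on the target, a target primer lying at a lower coordinate than the insertion, and a fixed distance from the transposon tip to the 5′ end of the element primer. All coordinates and parameter names here are hypothetical; they are not taken from pRM2 or from the primers named above, and the 5 bp target duplication is ignored.

```python
# Simplified model: amplicon length = target-side distance + element-side distance.
# All coordinates used below are hypothetical and for illustration only.


def product_length(insertion_pos, target_primer_pos, end_primer_offset):
    """Length (bp) of the PCR product for a single insertion.

    insertion_pos      coordinate of the transposon/target junction
    target_primer_pos  coordinate of the 5' end of the target primer
                       (assumed to be <= insertion_pos)
    end_primer_offset  bases between the transposon tip and the 5' end
                       of the primer that anneals inside the element
    """
    if insertion_pos < target_primer_pos:
        raise ValueError("insertion lies outside the amplified region")
    return (insertion_pos - target_primer_pos) + end_primer_offset


def insertion_position(length, target_primer_pos, end_primer_offset):
    """Invert product_length: recover the junction coordinate from a band size."""
    return target_primer_pos + (length - end_primer_offset)


if __name__ == "__main__":
    # Two hypothetical insertions give two bands whose sizes differ by
    # exactly the distance between the insertion sites.
    for pos in (350, 351):
        print(pos, product_length(pos, target_primer_pos=100, end_primer_offset=60))
```

Under this model, insertions at adjacent bases give products that differ by one nucleotide, which is why a gel with single nucleotide resolution is needed to distinguish them.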

[0240] Amplification of a pool of insertions generated a smear of reaction products when displayed on an agarose gel, as expected (data not shown). The PCR products were run on a 6% polyacrylamide denaturing gel to achieve single nucleotide resolution and visualized by Southern blotting and hybridization with a Tn7-specific probe (FIG. 7). The striking result is that the distribution of products is remarkably nonspecific. Insertions have occurred at nearly every base within the highly resolved lower portion of the gel. PCR products of more than roughly 200 bp in length are resolved poorly. Some areas of dense signal are seen in this region, potentially indicating preferential points of insertion. However, compression of bands could also account for the apparently singular products; analysis of these insertion products with other primer pairs supports this latter possibility (see below).

[0241] This confirms the inventor's hypothesis that the TnsABC^(A225V) machinery is capable of directing Tn7 transposition into the target plasmid with high efficiency and low specificity.

[0242] Surveillance of the Entire Target Plasmid

[0243] In the experiments above, the focus was on a relatively short region of the target plasmid pRM2 (SEQ ID NO:6). It was demonstrated that TnsABC^(A225V) can direct insertions into virtually every base pair of this region. To be certain that the phenomenon is not specific to the region of the plasmid, a family of primers was synthesized, each of which paired with a Tn7 end-specific primer to allow amplification of all regions of pRM2 (SEQ ID NO:6). These primers are spaced at approximately 500 bp intervals around the target plasmid and will amplify insertions in predominantly one orientation. FIG. 8 diagrams the amplicons for each primer pair and shows a denaturing gel Southern blot of the resulting PCR products. The results indicate that the C^(A225V)-mediated insertions do occur into positions all around the target plasmid. As was seen for the original amplicon analyzed, there is considerable variability in the strength of the signal for individual points of insertion, but insertions do occur at some level at every position. Thus, the TnsABC^(A225V) machinery does not appear to have a specificity for any particular region of this target plasmid.

[0244] In another approach to investigating the possible sequence specificity of TnsABC^(A225V) target site selection, 67 independent insertions into a 12 kb plasmid were collected and analyzed. TnsABC^(A225V) transposition reactions using a target plasmid containing several E. coli genes were transformed into E. coli to select kanamycin-resistant colonies. The target plasmids were then recovered and sequenced to determine the position of each insertion. 62 out of the 63 insertions were located in different positions on the target plasmid. A comparison of the sequences of these insertions supported our previous observations that there is very little sequence specificity governing the selection of TnsABC^(A225V) target sites. Attempts to derive a consensus sequence for the 5 bp target site duplication sequence revealed a faint preference for NYNRN (SEQ ID NO:14), but the bias is not very compelling.
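To make the NYNRN notation concrete, the following Python sketch tests 5 bp target site duplications against an IUPAC pattern (N = any base, Y = C or T, R = A or G) and computes the fraction of all possible 5-mers that match. The function names and the example sites are illustrative assumptions, not sequenced insertions from this study. With equal base frequencies, one quarter of random 5-mers match NYNRN, so a preference would appear as a matching fraction above that baseline.

```python
from itertools import product

# IUPAC codes needed for the NYNRN pattern, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
}


def matches(site, pattern="NYNRN"):
    """Return True if the site fits the IUPAC pattern position by position."""
    site = site.upper()
    if len(site) != len(pattern):
        return False
    return all(base in IUPAC[code] for base, code in zip(site, pattern))


def fraction_matching(sites, pattern="NYNRN"):
    """Fraction of the given sites that fit the pattern."""
    sites = list(sites)
    return sum(matches(s, pattern) for s in sites) / len(sites)


def random_expectation(pattern="NYNRN"):
    """Fraction of all equally likely sequences of this length that fit."""
    all_sites = ("".join(p) for p in product("ACGT", repeat=len(pattern)))
    return fraction_matching(all_sites, pattern)


if __name__ == "__main__":
    example_sites = ["ACGAT", "AAGAT", "GTCGC"]  # hypothetical duplications
    print(fraction_matching(example_sites), random_expectation())
```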

[0245] Exploiting the TnsABC^(A225V) Machinery for in vitro Mutagenesis

[0246] The high efficiency and low target specificity of the TnsABC^(A225V) transposition reaction make this a useful system for mutagenizing a variety of DNA targets. Insertional mutagenesis could be performed on cosmid libraries, cDNA libraries, PCR products, BACs, YACs, and genomic DNAs, among others. The inventor has mutagenized pUC-based plasmids, cosmids, BACs ranging in size from 5 to 120 Kb (data not shown), and H. influenzae genomic DNA (Gwinn et al., 1997). In fact, the inventor has not encountered DNA that cannot serve as a target for TnsABC^(A225V) transposition.

[0247] Once DNA targets have been successfully mutagenized in vitro, the simple insertions will be recovered. For a simple insertion product to become a stable replicon, the 5′ nonhomologous overhangs trailing off both ends of the inserted transposon must be removed, the gaps filled in, and the strands ligated. A simple method to perform such processing functions is to transform the pool of transposition products into a host and rely on the host's repair machinery, selecting for a transposon-borne marker. In E. coli, the 5′ single-stranded overhangs and gaps on either end of the transposon after a simple insertion are readily repaired by the host (see below). The donor plasmid for other hosts could be customized in a number of ways to best facilitate the recovery of the desired insertional mutants.

[0248] The inventor recovered simple insertions into pRM2 in E. coli,since Tn7 insertions can be easily repaired in this host.

[0249] Simply transforming transposition reactions into host cells as a method to recover simple insertion isolates presents a background contributed by donor molecules that have not undergone transposition and thus continue to carry the selectable marker on a stable replicon. In order to eliminate the background false positives that can complicate a screen for insertional mutants, the ability of the unreacted donor to transform cells can be reduced. Two methods have been provided: 1) destruction of the donor plasmid's ability to replicate by restriction digestion prior to the transposition reaction, and 2) use of a conditional replicon origin in the donor backbone which renders the donor incapable of replication in the cells being ultimately transformed with the transposition pool.

[0250] For the first method, 5 identical TnsABC^(A225V) reactions were carried out on linearized pMCB31 donor DNA paired with cosmid clone ES#3 target DNA, an approximately 50 kb replicon which contains an insert of genomic DNA from E. tarda. Linearizing the plasmid will prevent the donor plasmid from replicating once transformed into the host. The products were pooled for the extraction and precipitation steps, and then a portion of the resultant sample was transformed into BRL Subcloning Efficiency DH5-alpha competent cells. Assuming a 10% loss in the recovery of the DNA after the transposition reaction, the efficiency of transformation relative to μg of input donor DNA was approximately 3.8×10⁴ colonies/μg/ml of cells. One-tenth of a microgram of donor DNA is typically used in a reaction, so by extension, if all of the product DNA from a single transposition reaction is transformed, 3800 colonies could be isolated, an efficient mutagenesis. The DH5^(alpha) cells are advertised to have a transformation efficiency of equal to or greater than 1×10⁷ colonies/μg supercoiled pUC19/ml of cells. Simply using higher efficiency cells or electroporation cells should yield considerably higher numbers of isolates from a single transposition reaction, and probably aid in picking up rarer events.
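The extrapolation from the observed efficiency to the expected yield per reaction is simple arithmetic, sketched below in Python. The function and constant names are illustrative; the two input figures are the ones quoted in the paragraph above.

```python
# Expected colonies from one transposition reaction, using the figures in the text.

OBSERVED_EFFICIENCY = 3.8e4   # colonies per ug of input donor DNA
DONOR_PER_REACTION_UG = 0.1   # donor DNA typically used in one reaction


def expected_colonies(efficiency_per_ug, donor_ug):
    """Colonies expected if all product DNA from a reaction is transformed."""
    return efficiency_per_ug * donor_ug


if __name__ == "__main__":
    n = expected_colonies(OBSERVED_EFFICIENCY, DONOR_PER_REACTION_UG)
    print(f"about {n:.0f} colonies per reaction")
```

The estimate scales linearly with the efficiency term, so substituting the measured efficiency of a more competent cell preparation gives the corresponding expected yield.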

[0251] Another method employs a heterologous origin of replication on the donor plasmid, for example, R6K. Replicons relying on this origin must be maintained in a host carrying a resident copy of the pir gene, which codes for the π protein, a necessary component for initiation of replication at R6K_(gamma) origins. Thus, false positives stemming from unreacted donor molecules are readily eliminated by transforming the transposition products into cells lacking pir, and relying on the competent origin of replication in the target molecule for recovery of simple insertion isolates. Transposition reactions employing this donor were prepared for transformation as described above.

[0252] It is conceivable that the larger plasmid (˜50 kb) would be more difficult to transform after receiving an insertion because it would be a large open circular molecule approximately 10 times the size of the pRM2 (3.2 Kb) open circle with an insertion. To gain insight into the possibility of target size limitations using the transformation method of simple insertion recovery, transpositions of the miniTn7 element from pMCB40 into the two target plasmids were directly compared. The transformation efficiencies of the two reactions were very similar. The different targets were included at comparable concentrations in the transposition reactions, but were not equimolar. The results suggest that ES#3 simple insertions can transform the cells at nearly the same efficiency as the smaller pRM2 simple insertions. It is difficult to test reaction conditions under which the cosmid target is available at the same molarity as the pRM2 target because elevated levels of total DNA in the reactions can compromise the reproducibility with which DNA is recovered after transposition.

[0253] The high transformation efficiencies demonstrate the utility of this reaction for a mutagenesis in which the simple insertion products can be stably replicated in an E. coli host. This same type of protocol could be used in other bacterial species and strains with development of the appropriate DNA substrates.

[0254] Discussion

[0255] TnsC^(A225V) Circumvents the Requirement for a Targeting Protein.

[0256] Tn7 demonstrates considerable diversity when it comes to target site selection. It has a sophisticated system for choosing either a highly conserved “safe haven” in the E. coli chromosome (attTn7) or somewhat random sites throughout a cell's genome or resident conjugable plasmid, mediating these different selections via alternative targeting proteins encoded by the element. In this way, Tn7 is significantly different from all other well-characterized transposable elements, whose target site selections are mediated predominantly by either the transposase alone (e.g., IS10/Tn10) or the transposase in conjunction with one other accessory protein (bacteriophage Mu). IS10/Tn10 selects a target site via a direct interaction of the Tn10 transposase with the target DNA. It has been demonstrated that particular mutations in the Tn10 transposase are capable of altering target recognition features while leaving other functions of the transposase unaffected (65). The bacteriophage Mu, however, encodes a transposase, MuA, and an ATP-dependent activator of MuA, MuB. MuB functions as an accessory protein that, when complexed with target DNA, attracts the MuA transposase to the site of insertion. It is likely that having more proteins involved has allowed Tn7 to be more adaptive to environmental changes when choosing its new sites of residence, and ensured its survival by enabling it to employ a more tailored approach to disseminating itself amongst various cell populations.

[0257] This example has focused on the role of TnsC in the selection of a target site. As discussed, TnsC has been implicated as the major communicator between the TnsAB transposase bound to donor DNA, and the TnsD or TnsE targeting proteins, complexed with target DNA. Experiments have shown that TnsC does have the capacity to bind DNA nonspecifically in the absence of TnsD and TnsE (ref) but attempts to isolate simple insertions in vivo and in vitro in the absence of the targeting proteins proved unsuccessful with wild-type TnsC (23). Isolation of the TnsC^(A225V) mutant, however, has permitted the inventor to circumvent this requirement and isolate simple insertions from reactions lacking TnsD and TnsE. Not only does the mutant facilitate the recovery of simple insertions, it does so very efficiently.

[0258] Ability of TnsC^(A225V) to Insert Nonspecifically

[0259] It is clear that TnsC^(A225V) has a considerable gain of function over wild-type TnsC, as evidenced by the increased yield of simple insertions in a standard in vitro reaction (25) (26) (this example). A more detailed evaluation was necessary to determine the actual sites of insertion because restriction digests of the product pools indicated that there is extensive variability in site selection relative to TnsD-mediated insertions, which are targeted almost exclusively to the attachment site. PCR amplification of pools of transposition products followed by high resolution denaturing gel analysis of several independent reactions has revealed that the insertions into the pRM2 target plasmid are detectable at every base visible within the well-resolved portions of the gels. Although the target site selection is not completely random (there are differences in band intensities), one possibility is that the nonspecific DNA binding activity of TnsC has been enhanced in the TnsC^(A225V) mutant, giving the protein the capacity to direct the TnsAB transposase to the wide variety of insertion sites observed.

[0260] It is possible that the TnsC^(A225V) mutation has altered TnsC in such a way that it simulates a TnsC-TnsE complex, capable of insertions at more random sites. Perhaps the role of TnsE is to strengthen TnsC's nonspecific interaction with the target DNA, thereby promoting insertions into sites where TnsC and TnsE happen to complex. The ability of TnsE to preferentially direct transposition to conjugating plasmids (14) holds true when TnsC^(A225V) is substituted for wild-type TnsC (SEQ ID NOS:1 and 2) (25) (26). This suggests that this mutation in TnsC does not compensate for all specific activities of a targeting protein. The TnsABC^(A225V) reaction is also sensitive to the presence of the target site specific protein TnsD, as evidenced by a detectable increase in the frequency of insertions when TnsD is present.

[0261] These observations may explain why Tn7 has chosen to preserve a more complicated target site selection mechanism. In a cell containing only wild-type proteins, an extra layer of regulation can be exercised when two proteins complex to direct insertions, and the result may be less deleterious to cell populations than the somewhat rampant levels of insertions observed in reactions with the TnsC^(A225V) in the absence of targeting proteins. Occurrence of a mutation like TnsC^(A225V) in nature would decrease the specificity and increase the frequency of insertions, the consequence of which could quite possibly be more insertions into essential genes.

[0262] It is conceivable that TnsC has always played the primary role in directing the TnsAB transposase to insert, and the targeting proteins are more accessory. The inventor has envisioned TnsD binding DNA near the attachment site, and TnsC acting as an activation bridge to the transposase, but an alternative view is that the ability of TnsC to bind DNA plays a more central role in directing the donor complex to an insertion site, and TnsD has the role of “steering” a TnsAB+TnsC complex to a particular point of insertion. The A225V point mutation could confer the ability for TnsC to “steer” the donor complex to insert without the aid of a target-binding protein.

[0263] There is No Apparent Sequence Preference at the Point of Insertion

[0264] Two main approaches have been taken herein to analyze the TnsC^(A225V)-mediated insertions at nucleotide resolution. The first involves scanning along a short segment of DNA using PCR and high resolution denaturing gel analysis, quantitating specific signals at each base in a processive manner, and attempting to flush out a sequence motif common to those with the highest signals or lowest signals. The second method focuses on the recovery of the more frequent insertions only, those that can be recovered by simply transforming the transposition products, and relying on the host to conduct a successful repair of the replicons. These two methods provide different views of a common process. Since recovery of specific insertions is reliant on the process of transformation, rare insertion events that can be visualized by the PCR/denaturing gel method will most likely be severely underrepresented in a population of recovered transformants, if we assume that a higher concentration of a specific template will give rise to a diagnostic PCR product of higher intensity. This should bias the representative data accumulated from transposition product transformations to overlap with the subset of PCR products analyzed by denaturing gel analysis with the highest band intensities. In this way, both types of data are valid for attempting to determine a preferred insertion site.

[0265] The inventor's search for a common insertion site motif failed to uncover any preferred single nucleotides or groups of nucleotides that showed a higher incidence amongst the most intense signals on a denaturing gel or amongst the insertions isolated by transformation. Similarly, there were no apparent motifs common amongst the least preferred sites analyzed in the denaturing gel analysis. The lack of a sequence preference for insertions with this reaction is a very desirable result if it is to be employed as a highly nonspecific method for mutagenizing DNAs.

[0266] TnsC^(A225V): A Tool for in vitro Mutagenesis

[0267] The impressive efficiency and low specificity of the TnsABC^(A225V) in vitro reaction make the reaction an excellent tool for in vitro mutagenesis. The high efficiency of the reaction (i.e., the high percentage conversion of donor substrate to double-ended simple insertions) is critical when considering how the recombinant DNAs will be recovered. The observation that the majority of the molecules resulting from a reaction that contain a junction between the donor DNA and the target DNA are double-ended simple insertions provides an advantage over alternative transposon-based insertional mutagenesis systems because large portions of the junctions seen in these reactions can be single-end joins (Rowland, S. J. et al. EMBO J. 14:196-205 (1995)). This study has demonstrated that standard commercially available E. coli competent cells are capable of repairing the characteristic gapped molecules formed as a result of a Tn7 simple insertion, provided the target DNA contains an origin capable of replicating in E. coli. Thousands of isolates can be recovered from a single transposition reaction starting with sub-microgram quantities of donor and target DNA. High efficiency cells should yield even greater numbers of isolates. Tn7 insertional mutants could be recovered from many different organisms as long as the target DNA carries information required to replicate in its respective host, the gaps can be repaired by the host, and DNAs can be reintroduced into the host with reasonable efficiency.

[0268] Cosmid clones have been successfully mutagenized and recovered by the method just described. Pilot reactions were done using purified cosmid clones, but it would be very simple to mutagenize an entire cosmid library and select for mutants by the same process. Replicons as large as 125 kb (a BAC, data not shown) have been successfully targeted and recovered. An earlier study of the inventor demonstrated that the ability of transposition machinery to recognize whether or not a potential target molecule already contains an insertion breaks down as the distance between two insertion sites increases (52). It has been shown that the degree to which a target molecule is “immune” to a second insertion has an inverse relationship to the length of separation of the sites of insertion. TnsC^(A225V) has demonstrated a sensitivity to immunity signals. To date, the inventor has seen very few examples of double insertions into plasmids in the 40-50 kb range, suggesting that this tool will be highly effective for mutagenizing cosmids or plasmids in the 1-50 kb range.

EXAMPLE 3

[0269] A Kit for Making Transposon Insertions

[0270] The kit provides transposon insertions into DNA in vitro. These insertions can be used to provide priming sites for DNA sequence determination, or to provide mutations suitable for genetic analysis, or both.

[0271] Section A: Reaction Constituents

[0272] A1) PROTEINS

[0273] TnsA 30 μg/ml in 10% glycerol

[0274] TnsB 20 μg/ml in 25% glycerol

[0275] TnsC₁₂₇ 100 μg/ml in 10% glycerol

[0276] Proteins were kept at −70° C.

A2) BUFFER CONSTITUENTS

HEPES      0.25 M, pH 8.1
Tris       0.25 M, pH 7.6 [can be omitted]
BSA        10 mg/ml
tRNA       50 μg/ml [can be omitted]
DTT        1 M
ATP        100 mM
MgAcetate  375 mM

A3) TRANSPOSON DONOR PLASMID 100 μg/ml

[0277] The essential features of the plasmid are described above as containing the R6K conditional replicon.

[0278] A4) CONTROL TARGET PLASMID

[0279] pLITMUS28 400 μg/ml

[0280] This plasmid contains both pUC and M13 origins, a lacZ′ MCS and amp. See the Figure legend for FIG. 10B.

[0281] (New England BioLabs, 32 Tozer Road, Beverly, Mass., 01915)

A5) SEQUENCING PRIMERS

NLC94 (SEQ ID NO: 13)  3 pmol/μl
NLC95 (SEQ ID NO: 7)   3 pmol/μl

[0282] Section B: (Can be Supplied by User)

[0283] B1) FOR THE REACTION in vitro

[0284] water; Milli-Q or equivalent recommended

[0285] Target DNA not carrying Kanamycin resistance (0.4-0.5 μg per reaction)

[0286] Water bath or heat block, 30° C.

[0287] 1.5 ml microtubes or other vessel; one per reaction.

[0288] B2) FOR STOPPING THE REACTION

[0289] when using chemically competent cells

[0290] Water bath or heat block, 75° C. Note: not 65° C.

[0291] when using electrocompetent cells

[0292] Distilled phenol equilibrated with TE or Tris pH 8.0

[0293] Chloroform equilibrated with TE or Tris pH 8.0

[0294] EtOH for precipitation

[0295] NaAcetate 3 M

[0296] Water or 1 mM Tris pH 8 or TE

[0297] B3) FOR RECOVERING INSERTIONS:

[0298] B3a) Transformable Cells:

[0299] Any standard E. coli strain can be used; we have used ER1821, ER2502 and MC1061 (New England BioLabs, 32 Tozer Road, Beverly, Mass., 01915).

[0300] Any kanamycin-sensitive organism in which npt can be expressed can also be used with the KanR donor, including but not limited to Salmonella, other enteric organisms, Haemophilus, Rhizobium, and Bacillus. With a suitably altered selectable marker on the transposon donor plasmid, any prokaryotic or eukaryotic organism into which exogenous DNA can be introduced may be used to recover insertions.

[0301] In this example,

[0302] B3ai) Chemically competent ER1821, New England BioLabs, 32 Tozer Road, Beverly, Mass., 01915 (2×10⁷ transformants/μg of LITMUS or similar plasmid) was used. A sample protocol for preparing these is provided below, section D1.

[0303] In example 2 we show the use of

[0304] B3aii) Electrocompetent MC1061 (ATCC#53338)

[0305] (7×10⁹ transformants/μg of pLITMUS-28 or similar plasmid). A sample protocol for preparing these is provided below, section D2.

[0306] Commercially available competent or electrocompetent cells may also be used. The method of determining competence of these preparations is provided below, section D3.

[0307] B3b) Outgrowth media:

[0308] Rich Broth (D4a below) or mSOC (D4d) without drug, or equivalent.

[0309] 0.4 ml per reaction; we recommend three reactions as a standard pilot experiment (see Section C below).

[0310] B3c) Selective media:

[0311] Rich Agar with drug (D4b), or equivalent

[0312] at least 1 plate per reaction; the standard pilot experiment described in section C requires 6 plates, three with two drugs and three with one drug.

[0313] Kanamycin is REQUIRED to select for the transposon of the present example.

[0314] Ampicillin is used for the RECOMMENDED positive control. Carbenicillin can be substituted.

[0315] For the example of Section C, below, RB Kan Amp (3 plates) and RB Amp only (3 plates) are used. If the target plasmid carries some other drug resistance, the experimental reaction in the pilot experiment should be plated on Kanamycin plus that drug.

[0316] B4) FOR DNA PREPARATION FOR SEQUENCING (see example 2):

[0317] Any standard procedure that ordinarily gives sequencing grade DNA. We have tested Qiagen spin columns and gravity flow plasmid preparations.

[0318] Section C. Tn7 in vitro Transposition Reaction Protocol

[0319] C1. REACTION VOLUME=100 μl

[0320] C2. RECOMMENDED PILOT EXPERIMENT: 3 samples to be carried through.

Tube 1  Experimental (Target DNA, protein and donor plasmid added)
Tube 2  Reaction positive control (pLITMUS28, protein and donor plasmid added)
Tube 3  Reaction negative control (Target DNA added, no protein, donor added)

Tube 2 is also used as a transformation positive control.

[0321] In this example, all tubes have pLITMUS28 as target (tubes 1 and 2 are duplicates). Tube 2 need not necessarily be included in every experiment.

[0322] C3. MAKE UP a mix using reagents of Section A:

[0323] per reaction:

10 μl   Hepes (250 mM pH 8.1)
1 μl    Tris (250 mM pH 7.6)
0.5 μl  BSA (10 mg/ml)
2.1 μl  tRNA (50 μg/ml)
0.2 μl  DTT (1 M)
2 μl    ATP (100 mM)

[0324] C4. DISPENSE mix of step 3 to each tube: (89.7 μl) − (volume of target DNA) per reaction; in this example, this is 88.7 μl.

[0325] C5. ADD target DNA of section B (0.4 μg) to tubes 1-3. In this example, this is pLITMUS28, 1 μl. This works well for plasmid targets. For cosmids, 0.5 μg worked well when the cosmid was around 10 times the size of the donor (5.2 kb), i.e. a molar ratio of around 2:1 (donor to target). Increasing the ratio to 4:1 decreased the efficiency slightly.

C6. ADD to each tube:

          Tube 1   Tube 2           Tube 3
TnsA      1.3 μl   1.3 μl (40 ng)   0
TnsB      3 μl     3 μl (20 ng)     0
TnsC₁₂₇   1 μl     1 μl (100 ng)    0
dH2O      0        0                5.3 μl

[0326] C7. ADD 1 μl donor DNA (0.1 μg pMCB40). Mix well by pipetting up and down a few times.

               Tube 1   Tube 2   Tube 3
Donor pMCB40   1 μl     1 μl     1 μl

[0327] C8. INCUBATE 10 minutes at 30° C. (assembly reaction)

[0328] C9. ADD 4 μl MgAc (375 mM) to each tube. Mix well by pipetting up and down a few times.

       Tube 1   Tube 2   Tube 3
MgAc   4 μl     4 μl     4 μl

[0329] C10. INCUBATE 1 hour 30° C. (transposition reaction)

[0330] C11. HEAT INACTIVATE 75° C. 10 minutes. Note: 65° C. is not adequate.

[0331] C12. TRANSFORM using chemically competent cells (see procedure of section D1):

[0332] a. Add 10 μl of the reaction mix to 100 μl competent cells thawed on ice.

[0333] b. Incubate 1 h on ice.

[0334] c. Heat at 37° C. for 45 sec.

[0335] d. Chill on ice 2 min.

[0336] e. Dilute the transformation mix into 0.4 ml RB (total volume 0.5 ml).

[0337] f. incubate 40 min at 37° C.

[0338] g. plate 100 μl tubes 1-3 on Kanamycin-containing selectivemedia.

[0339] h. plate dilutions of tube 2 on medium selective for the targetplasmid only: dilute 100 fold (10 μl/1 ml) and 1000-fold (1 μl/1 ml) andplate 100 μl of undiluted and of each dilution (3 plates)

[0340] In this example, selective medium was RB Kan (20 μg/ml) Amp (100μg/ml) (tubes 1-3) and RB Amp (100 μg/ml, tube 2). Competent cells wereER1821, chemically competent (Section D1).

[0341] C13. Transformation result:

[0342] On Kan Amp:

[0343] Tube 1 285 colonies

[0344] Tube 2 600 colonies

[0345] Tube 3 0 colonies

[0346] On Amp only:

[0347] Tube 2 confluent (undiluted)

[0348] Section D: Recipes and Auxiliary Procedures

[0349] D1) Chemically competent cells (E. coli):

[0350] a. Inoculate a single colony from an RB agar plate (see D4b) into 2 ml of RB (D4a) in a plating tube. Shake overnight at 37° C.

[0351] b. Subculture the overnight 1:100 in 1 Volume Unit of RB+20 mM MgSO₄ (typically 250 ml). Grow to OD₅₉₀=0.4-0.6 or Klett=60 (˜2-3 h).

[0352] c. Centrifuge 5,000 rpm 5 min at 4° C.

[0353] d. Gently resuspend pellet in 1/2.5 Volume Unit ice-cold TFBI (see below, D4f). Keep all steps on ice and chill all pipets, tubes, flasks, etc. from this point on.

[0354] e. Incubate on ice for 5 min.

[0355] f. Centrifuge 5,000 rpm 5 min 4° C.

[0356] g. Gently resuspend pellet in 1/25 original volume cold TFBII (D4g). For 250 ml of original subculture, use 10 ml TFBII.

[0357] h. Incubate on ice 15-60 min. before aliquoting 100 μl/tube for storage at −70° C. Quick-freeze the tubes.

[0358] i. To transform, thaw an aliquot on ice; add DNA; incubate 1 h on ice; heat shock 45 seconds at 37° C.; incubate on ice 2 min; dilute 5-fold into RB with no drug (for phenotypic expression); grow with vigorous aeration at 37° C. for 20 min.; plate on selective medium.

[0359] This procedure works with most strains and should routinely give >10⁷ cfu/μg of pLITMUS28 (using 0.1 ng/transformation). Frozen cells last at least a year.

[0360] D2) Electrocompetent cells (E. coli)

[0361] D2a. Rationale and Comments

[0362] This procedure prepares cells for use in gene transfer employing an electroporator device such as that supplied by BioRad. DNA is introduced into cells by means of an electric field.

[0363] Successful electroporation requires a low electrolyte concentration, to avoid arcing (and cell killing) in the device. Cells are grown to midexponential phase, washed extensively in distilled water and sterile 10% glycerol, concentrated 500-fold in glycerol, aliquoted and stored at −70° C.

[0364] Any strain can be used for this purpose, although some strains are said to give larger numbers of transformants. Resuspended cells should be well-dispersed for best results. Some strains resuspend more evenly in the low electrolyte solutions; some lyse under these conditions with rough treatment.

[0365] The electroporation procedure itself involves transfer of the thawed cells to an electroporation cuvette (which has leads that contact the device appropriately), addition of DNA, imposition of the electric field, recovery from this treatment (by incubation in broth), and plating selectively.

[0366] Efficiency of transformation with this method is 100-500 fold greater than with standard transformation. It is therefore especially suitable when low transformation efficiency is expected or large numbers of transformants are desired. The method is said to be especially suitable for introduction of large DNA molecules.

[0367] D2b. Preparation of Electrocompetent E. coli Cells (from BioRad Recommended Procedure)

[0368] i. Materials for 2 ml of electrocompetent cells (20 aliquots, 100 μl):
Overnight culture of desired strain (in Rich Broth (D4a) or Luria Broth (D4c)): 1 ml
Luria Broth (D4c): 1 L
dH₂O, sterile, 4° C. or 0° C.: 1.5 L
10% (w/v) glycerol, sterile (D4h): 22 ml
1 L sidearm flasks: 2
250 ml centrifuge bottles: 6
50 ml Oak Ridge centrifuge tubes: 2
1.5 ml microtubes, polypropylene: 20
Pipet tips (sterile) for P200 or equivalent: 20
Sterile glass or plastic pipets, 25 ml: 3
Klett-Summerson colorimeter
High speed centrifuge (e.g. Beckman J21)
Micropipetter, e.g. Gilson Pipetman P200
Water bath rack that can be used to immerse tubes in liquid nitrogen
Liquid nitrogen bath for quick freezing

[0369] ii. Procedure for making electrocompetent cells

[0370] Be sure the sterile dH₂O and 10% glycerol are cold.

[0371] If necessary, distribute the Luria Broth to sidearm flasks, 500 ml/flask.

[0372] Inoculate each flask with 0.5 ml of the overnight culture.

[0373] Incubate with shaking until Klett=90 (5×10⁸ cfu/ml). (Quick conversion if Klett is not available: 1 OD=150 Klett Units; 10⁹ cells/1.1 OD.)

[0374] Chill on ice with swirling, until cold. It is very important to keep everything cold from this point on.

[0375] Transfer to centrifuge bottles, 167 ml/bottle or as desired.

[0376] Centrifuge 4,000 rpm 15 min 5° C. in JA14 rotor in Beckman. Decant supernatant.

[0377] Resuspend gently in equal volume (1 L total) cold sterile distilled water. Keep in an ice bath while resuspending. Repeated pipetting will help; chill pipets for this use.

[0378] MC1061 cells (ER1709) can be kept on ice at this stage for at least an hour.

[0379] Centrifuge 4,000 rpm 15 min 5° C. in JA14 rotor in Beckman; decant supernatant.

[0380] Resuspend gently in ½ volume cold sterile distilled water (0.5 L total). Keep in an ice bath while resuspending. Cells can now be combined into three bottles if desired.

[0381] Centrifuge 4,000 rpm 15 min 5° C. in JA14 rotor in Beckman. Decant supernatant.

[0382] Resuspend in 1/50th volume cold sterile 10% glycerol (20 ml total). Keep cold while resuspending.

[0383] Transfer entire amount to a 50 ml Oak Ridge tube (35 ml capacity).

[0384] Centrifuge 4,000 rpm 15 min 5° C. in JA17 rotor in Beckman, with balance tube. Decant supernatant.

[0385] Resuspend in 1/500th volume (2 ml total) cold 10% glycerol. Keep cold.

[0386] Distribute 100 μl/tube to microtubes in ice water bath rack; immerse rack in liquid N₂; transfer to box; store at −70° C.

[0387] D2c. Procedure for Electroporation of Poratable E. coli Cells (from BioRad Recommended Procedure)

D2ci. Materials (per electroporation reaction)
Electrocompetent cells: 100 μl
18 × 150 mm culture tubes: 1
Electroporation cuvettes (BioRad cat# 1652086 or equivalent): 1
mSOC (see D4d): 1 ml
Pasteur pipets, sterile: 1

[0388] DNA to be transformed; in low ionic strength medium, e.g. dH₂O or TE (see D4i).

[0389] Electroporator (BioRad Gene Pulser or equivalent)

[0390] Ice bath trays for cuvettes and outgrowth tubes

[0391] Rollordrum in 37° C. incubator or other means of incubating culture tubes

[0392] Selective agar plates and plating materials

[0393] 37° C. or suitable temperature incubator

[0394] D2cii. Procedure

[0395] Be sure all materials are set up ready to go before getting cells out of the freezer. The DNA must be added and the electroporation done as soon as the cells are thawed; cells will lyse after a short time, resulting in arcing as the medium becomes more conductive.

[0396] Chill cuvettes and hold on ice (>5 min). Transformation efficiency declines at least 100-fold if cuvettes are at room temperature.

[0397] Set BioRad Gene Pulser to 25 μF capacitance, 2.5 kV, and the pulse controller to 200 Ω (maximum voltage).

[0398] Thaw electrocompetent cells at room temperature and transfer toice.

[0399] In a cuvette mix 40 μl cells with 0.4 μg-0.3 μg DNA. Shake the suspension to the bottom of the cuvette, rap on table to shake loose air bubbles.

[0400] Place the cuvette in the holder

[0401] Apply one pulse by pushing both red buttons until a beep is heard. This will result in a pulse of 12.5 kV/cm with a time constant of 4-5 msec.

[0402] Immediately add 1 ml mSOC to the cuvette and gently but quicklyresuspend the cells.

[0403] A P1000 with sterile blue tips or sterile Pasteur pipets can be used for this. A 1 min delay in adding the medium results in a 3-fold decrease in transformation efficiency.

[0404] Transfer cells to culture tube.

[0405] Incubate 37° C. 1 hour

[0406] Plate on selective media.

[0407] D3) Standardization of Transformation or Electroporation

[0408] D3a. Rationale and Comments

[0409] To ensure that gene transfer is successful, we recommend that the cells prepared above (D1 or D2) or purchased commercially be transformed with a standard DNA dilution series before experimental use. Below is an example of such a standardization for electrocompetent cells (D2). Chemically competent cells will yield 100-500 fold fewer transformants, so dilutions given below should be appropriately adjusted.

[0410] D3b. Materials for a Standardization Experiment

[0411] Dilutions of standard DNA, usually a high-copy small plasmid (e.g. LITMUS28), in TE:
A: 1 ng/μl
B: 10 pg/μl
C: 1 pg/μl
D: 100 fg/μl
Selective agar plates; RB 1.5% Amp 100 μg/ml for pLITMUS28: 12
Dilution medium, usually 0.85% saline: 7 ml
Dilution tubes, usually 13 × 100 mm: 7
Sterile plastic or glass pipets, 0.1 ml: 10
Sterile plastic or glass pipets, 0.2 ml: 1
Sterile plastic or glass pipets, 1 ml: 1
Micropipetters, e.g. P200 and P20 or P10, for DNA transfer and dilution series

[0413] Pipet tips for P200 and P20 or P10

[0414] Spreader

[0415] Ethanol or isopropanol for flaming the spreader

[0416] 37° C. incubator

[0417] D3c. Procedure for Standardization Experiment

[0418] D3ci. Set up dilution tubes below and label plates beforehand or while cultures are growing out.

[0419] D3cii. Carry out electroporation as above (D2) with DNA dilutions A-D.

D3ciii. Place cultures on ice to prevent further growth while making dilutions and plating as below.

[0420] D3civ. Dilute in saline:
Sample A: 10⁻¹, 10⁻², 10⁻³, 10⁻⁴
Sample B: 10⁻¹, 10⁻²
Sample C: 10⁻¹
Sample D: no dilutions

[0421] This can be carried out as:

[0422] 10⁻¹ dilution: 100 μl sample+900 μl saline

[0423] 10⁻² dilution: 10 μl sample+1 ml saline

[0424] 10⁻³ dilution: 10 μl 10⁻¹ dilution+1 ml saline

[0425] 10⁻⁴ dilution: 10 μl 10⁻² dilution+1 ml saline.

[0426] D3cv. Plate on selective media by spreading; flame the spreader after each plate:
Sample   undiluted                 10⁻¹     10⁻²     10⁻³     10⁻⁴
A        —                         0.1 ml   0.1 ml   0.1 ml   0.1 ml
B        0.1 ml                    0.1 ml   0.1 ml   —        —
C        0.1 ml                    0.1 ml   —        —        —
D        0.1 ml, 0.2 ml, 0.5 ml    —        —        —        —

[0427] D3cvi. Example of result (dilution is given as the negative power of ten, followed by the volume plated in ml):
Sample   DNA added   Dilution/vol plated   Colonies        Transformants per ml   Transformants per μg
A        1 ng        1/0.1                 Confluent
                     2/0.1                 very numerous
                     3/0.1                 ˜1000
                     4/0.1                 71              7 × 10⁶                7 × 10⁹
B        10 pg       0/0.1                 very numerous
                     1/0.1                 405
                     2/0.1                 49              4 × 10⁴                4 × 10⁹
C        1 pg        0/0.1                 very numerous
                     1/0.1                 106             1 × 10⁴                1.1 × 10¹⁰
D        100 fg      0/0.1                 ˜500
                     0/0.2                 173
                     0/0.5                 75              8 × 10²                8 × 10⁹
Average transformants/μg: 7.6 × 10⁹

D4) Recipes

Bacteriological

D4a) RB, per liter
Tryptone (Difco): 10 g
Yeast Extract (Difco): 5 g
NaCl: 5 g
NaOH (1 N): 2 ml
Autoclave.

D4b) RB Agar with drug, per liter
Tryptone (Difco): 10 g
Yeast Extract (Difco): 5 g
NaCl (Baker): 5 g
NaOH (1 N): 2 ml
Agar (Difco): 15 g
Autoclave.
Drugs: add after autoclaving and cooling to 55° C., per liter:
Kanamycin (REQUIRED): 20 mg
Other drugs that MAY be added, per liter; choice depends on target plasmid:
Ampicillin or carbenicillin: 100 mg
Chloramphenicol: 15 mg
Tetracycline: 15 mg
Other drugs not tested but presumably usable in an appropriate host strain: Spectinomycin, Streptomycin, Gentamycin, Erythromycin, Rifampicin (recessive marker), Bleomycin, other antibacterial small molecules.

D4c) Luria Broth, per liter
Tryptone: 10 g
Yeast extract: 5 g
NaCl: 10 g
MgCl₂·6H₂O: 1 g
Glucose: 1 g
Aliquot and autoclave. For preparing electrocompetent cells (D2) it is convenient to aliquot 500 ml/flask in 1 L sidearm flasks before autoclaving.

D4d) mSOC, per liter (modified from BioRad recipe)
Luria Broth: 1 L
MgSO₄, 1 M sterile: 10 ml
40% glucose, sterile: 6.5 ml
Add MgSO₄ and glucose sterilely to sterile Luria Broth.

D4e) 0.85% saline, per liter
NaCl: 8.5 g
Distribute in suitable aliquots, autoclave.

Buffers and storage media

D4f) TFBI
30 mM KOAc (potassium acetate)
100 mM RbCl
10 mM CaCl₂
50 mM MnCl₂
15% glycerol
Adjust to pH 5.8 with acetic acid and filter to sterilize.
It is convenient to make this as:
5 g RbCl (Alfa)
12.3 ml KOAc 1 M
4.1 ml CaCl₂ 1 M
20.5 ml MnCl₂ 1 M (this is pink)
61.5 g glycerol
pH to 5.8 with ≦8 ml HOAc 0.1 M; make up to 410 ml; distribute in 100 ml sterile aliquots; and use 1 aliquot/250 ml culture.

D4g) TFBII
10 mM MOPS
75 mM CaCl₂
10 mM RbCl
15% glycerol
Adjust pH to 6.5 with KOH and filter to sterilize.
Make up as:
1.5 ml MOPS 1 M pH 6.5 (this is yellow)
11.25 ml CaCl₂ 1 M
1.5 ml RbCl 1 M
22.5 g glycerol
pH with 1 N KOH; make to 150 ml, filter; use 10 ml per original 250 ml culture.

D4h) 10% glycerol, per liter
Glycerol: 100 g
dH₂O: 1 L
Aliquot and autoclave.

D4i) TE, per liter
1 M Tris pH 8.0: 10 ml
0.5 M EDTA pH 8.0: 2 ml
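Returning to the standardization example of D3: the transformants-per-ml and transformants-per-μg columns follow from the colony counts by simple arithmetic. The sketch below is illustrative only; it assumes an outgrowth volume of 1 ml (the mSOC added after the pulse), uses for each sample the plate that appears to have been used for the tabulated value, and the function names are hypothetical.

```python
def cfu_per_ml(colonies, dilution_exponent, volume_plated_ml):
    """Colony-forming units per ml of the undiluted outgrowth culture."""
    return colonies * 10 ** dilution_exponent / volume_plated_ml


def transformants_per_ug(cfu_ml, dna_ug, outgrowth_ml=1.0):
    """Transformants per ug DNA; outgrowth_ml = 1.0 is an assumption."""
    return cfu_ml * outgrowth_ml / dna_ug


# sample: (colonies, dilution exponent, ml plated, ug DNA added)
plates = {
    "A": (71, 4, 0.1, 1e-3),   # 1 ng
    "B": (405, 1, 0.1, 1e-5),  # 10 pg
    "C": (106, 1, 0.1, 1e-6),  # 1 pg
    "D": (173, 0, 0.2, 1e-7),  # 100 fg
}

per_ug = {
    name: transformants_per_ug(cfu_per_ml(col, exp, vol), dna)
    for name, (col, exp, vol, dna) in plates.items()
}
average = sum(per_ug.values()) / len(per_ug)

for name, value in per_ug.items():
    print(f"{name}: {value:.2e} transformants/ug")
print(f"average: {average:.2e} transformants/ug")
```

With these inputs the average comes out close to the 7.6 × 10⁹ transformants/μg reported in the table.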

EXAMPLE 4 Random Insertion of Primers for Sequencing

[0428] Section A: Components used for Transposition Reaction

[0429] A1) PROTEINS
TnsA 40 μg/mL in 10% glycerol
TnsB 20 μg/mL in 50% glycerol
TnsC₁₂₇ 100 μg/mL in 50% glycerol
Stored at −70° C.

A2) BUFFER CONSTITUENTS
HEPES 0.25 M pH 8.1
Tris[Cl] 0.25 M pH 7.6 [can be omitted]
BSA 10 mg/ml
tRNA 50 μg/ml
DTT 1 M
ATP pH 7 100 mM
MgAcetate 375 mM
TnsD storage buffer. TnsD is stored in the following buffer: 3.3 μl 500 mM KCl, 50 mM Tris-HCl (pH 8.0), 1 mM EDTA, 2 mM DTT and 25% glycerol

A3) TRANSPOSON DONOR PLASMID
pEM delta R.adj to 1 50 μg/ml (Sequence appears in FIG. 9B and SEQ ID NO:3)

A4) TARGET PLASMID
1) pER183 mini-cleared lysate 200 μg/ml
2) pER183 CsCl preparation 400 μg/ml
3) pRM2 400 μg/ml
(Sequence of pER183 appears in FIG. 10A and SEQ ID NO:5)

[0430] Section B: Components Used for Processing Reaction

[0431] Phenol/chloroform equilibrated with TE

[0432] Phenol equilibrated with Tris pH 8.0

[0433] NaAcetate 3 M

[0434] Ethanol (EtOH)

[0435] BstEII New England BioLabs, 32 Tozer Road, Beverly, Mass., 01915

[0436] DNA Polymerase I Holoenzyme New England BioLabs, 32 Tozer Road,Beverly, Mass., 01915

[0437] T4 DNA Ligase New England BioLabs, 32 Tozer Road, Beverly, Mass.,01915

[0438] 10× Fi/L buffer (section I3)

[0439] 10× Buffer 3 (NEB#007-3) New England BioLabs, 32 Tozer Road,Beverly, Mass., 01915

[0440] tRNA 1 mg/ml

[0441] DNA buffer (section I2)

[0442] TE (section I4)

[0443] Section C: Components Used for Recovery of Insertions

[0444] MC1061 electrocompetent cells (made and used as in Example 3, D2 and D3)
Selective media (made and used as in Example 3, D3 and D4)

[0445] Section D: Components Used for Sequence Determination

D1) SEQUENCING PRIMERS
NLC94, 3.2 pmol/μl. Sequence of this primer (SEQ ID NO:13): 5′AAAGTCCAGTATGCTTTTTCACAGCATAAC
NLC95, 3.2 pmol/μl. Sequence of this primer (SEQ ID NO:7): 5′ATAATCCTTAAAAACTCCATTTCCACCCCT

D2) QIAPREP SPIN MINIPREP KIT (Qiagen Cat #27106)

D3) ABI Sequencer (info) and reagents

Section E: in vitro transposition protocol

E1) MAKE UP Mix:
208.2 μl dH₂O
30 μl Hepes (250 mM pH 8.1)
3 μl Tris (250 mM pH 7.6)
1.5 μl BSA (10 mg/ml)
6.3 μl tRNA (50 μg/ml)
0.6 μl DTT (1 M)
6 μl ATP (100 mM)

E2) DISPENSE 85.2 μl to three tubes

E3) ADD target DNA of A4, 2 μl

E4) ADD to each tube:
           Tube 1    Tube 2    Tube 3
TnsA       1.3 μl    1.3 μl    1.3 μl
TnsB       1 μl      1 μl      1 μl
TnsC₁₂₇    1 μl      1 μl      1 μl
D buffer   3.3 μl    3.3 μl    3.3 μl
Donor      2 μl      2 μl      2 μl

E5) INCUBATE 30 minutes at 30° C. (assembly reaction)

E6) ADD 4 μl MgAc (375 mM) to each tube.

E7) INCUBATE 1 hour at 30° C. (insertion)
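The volumes in Section E can be checked for internal consistency: the mix of E1 should divide into three 85.2 μl portions, and each tube should end near the nominal 100 μl reaction volume. A minimal sketch of that bookkeeping follows; the dictionary keys are labels chosen here for illustration.

```python
# Master mix of step E1 (volumes in ul), made up for three reactions.
mix_ul = {
    "dH2O": 208.2,
    "HEPES": 30.0,
    "Tris": 3.0,
    "BSA": 1.5,
    "tRNA": 6.3,
    "DTT": 0.6,
    "ATP": 6.0,
}
N_TUBES = 3

# Per-tube additions of steps E3, E4 and E6 (volumes in ul).
additions_ul = {
    "target DNA": 2.0,
    "TnsA": 1.3,
    "TnsB": 1.0,
    "TnsC127": 1.0,
    "D buffer": 3.3,
    "donor": 2.0,
    "MgAc": 4.0,
}

mix_per_tube = sum(mix_ul.values()) / N_TUBES
reaction_total = mix_per_tube + sum(additions_ul.values())

print(f"mix per tube: {mix_per_tube:.1f} ul")
print(f"final reaction volume: {reaction_total:.1f} ul")
```

The mix works out to 85.2 μl per tube, as dispensed in E2, and the completed reaction to just under 100 μl.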

[0446] Section F: Reaction Processing

[0447] In this example, the transposon donor was capable of replicating in the host used for recovery of insertions. Transformation of the reaction mixture on plates selecting for the transposon and the target markers might well result in many colonies with two different plasmids, rather than with a single plasmid containing both markers. For this reason, we digested the reaction with a restriction endonuclease cleaving in the donor replicon but not within the transposon or in the target DNA. In addition, we examined the consequences of repairing the strands not ligated by the transposition reaction, using DNA polymerase I holoenzyme and ligase.

[0448] Per reaction (100 μl):

[0449] PC extract:

[0450] Add 100 μl phenol/chloroform, vortex

[0451] Centrifuge 5′ in microfuge

[0452] Backextract

[0453] Remove organic phase to a new tube with 100 μl TE; vortex

[0454] Centrifuge 5′ in microfuge

[0455] Combine aqueous phases (185 μl total)

[0456] EtOH precipitate

[0457] 20 μl 3M NaAc

[0458] 500 μl EtOH

[0459] chill on dry ice

[0460] Centrifuge 5 min in microfuge

[0461] Drain supernatant, air dry

[0462] Resuspend in 100 μl DNA buffer.

Divide each reaction for further treatment (all volumes are μl; values are given as Treatment A / Treatment B, where A = repair then digestion and B = digestion only):

1) Repair/ligation
DNA: 40 / 40
10X Fi/L: 5 / —
dH₂O: 2 / —
Pol I (10,000 u/ml): 2 / —
a) Incubate 15 min room temperature
Ligase (400,000 u/ml): 1 / —
b) Incubate 4 h 16° C.

2) Digestion
1 M NaCl: 6.0 / —
10X buffer 3: 6.0
BstEII (10,000 u/ml): 1 / 1
Incubate 60° C. 1 h

3) Protein removal, buffer exchange 1
Phenol, equilibrated: 50 / 50
a) Mix, centrifuge 5′ in microfuge
b) Back extract organic phase with DNA buffer
c) Combine aqueous phase
Total volume, step 3c: 100 / 100
3 M NaAc: 10 / 10
tRNA 1 mg/ml: 1 / 1
EtOH: 120 / 120
a) Incubate 5 min room temperature
b) Centrifuge, discard supernatant
c) Wash twice with cold 70% EtOH (100 μl)
DNA buffer: 50 / 50
d) Resuspend
Final volume, step 3d: 50 / 50

4) Buffer exchange 2: re-precipitation
DNA from step 3d: 35 / 35
3 M NaAc: 5 / 5
EtOH: 137.5 / 137.5
a) Incubate −70° C. overnight
b) Centrifuge, discard supernatant
c) Wash twice with 200 μl 70% EtOH
TE: 50 / 50
d) Resuspend

[0463] Section G: Recovery of Insertions

[0464] Electroporated 10 μl of samples into MC1061 following procedure of Example 3, section D3.

TABLE 2 Sample codes, treatments, and target concentrations corrected for losses during manipulation
Name   Target   Treatment   Selection   [Target DNA] (fmol/μl)
1A     pER183   Fi/L, Dig   Cam         0.015
1B     pER183   Digested    Cam         0.05
2A     pER183   Fi/L, Dig   Cam         0.98
2B     pER183   Digested    Cam         0.42
3A     pRM2     Fi/L, Dig   Amp         0.66
3B     pRM2     Digested    Amp         0.56

[0465] TABLE 3 Colony forming units per ml on appropriate selective plates
Sample                       1A         1B          2A         2B          3A          3B
Donor (or recomb)
  Kan only                   130        1.8 × 10⁵   5 × 10³    7 × 10³     3.7 × 10⁴   4 × 10⁴
Recipient
  Cam only                   1 × 10⁴    8 × 10⁵     4 × 10⁴    6 × 10⁶     —           —
  Amp only                   —          —           —          —           3 × 10⁷     4 × 10⁷
  Colonies/fmol              6 × 10⁴    1.6 × 10⁶   4 × 10³    1.4 × 10⁶   4.5 × 10⁶   7 × 10⁶
Recombinant
  Kan Cam                    16         2.7 × 10³   880        1 × 10⁴     —           —
  Kan Amp                    —          —           —          —           1.1 × 10⁵   4 × 10⁴
  Recomb/recip               1 × 10⁻³   3 × 10⁻³    2 × 10⁻²   1 × 10⁻³    4 × 10⁻³    1 × 10⁻³
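The two derived rows of Table 3 (colonies per fmol of target and recombinants per recipient) can be recomputed from the plate counts and the target concentrations of Table 2. The sketch below is illustrative: it assumes the 10 μl electroporated sample is recovered in 1 ml of outgrowth medium, and the names are hypothetical.

```python
VOLUME_ELECTROPORATED_UL = 10.0
OUTGROWTH_ML = 1.0  # assumption: cells are recovered in 1 ml of medium

# sample: (target fmol/ul, recipient cfu/ml, recombinant cfu/ml)
samples = {
    "1A": (0.015, 1e4, 16),
    "1B": (0.05, 8e5, 2.7e3),
    "2A": (0.98, 4e4, 880),
    "2B": (0.42, 6e6, 1e4),
    "3A": (0.66, 3e7, 1.1e5),
    "3B": (0.56, 4e7, 4e4),
}


def colonies_per_fmol(target_fmol_per_ul, recipient_cfu_ml):
    """Recipient colonies obtained per fmol of target DNA electroporated."""
    fmol_added = target_fmol_per_ul * VOLUME_ELECTROPORATED_UL
    return recipient_cfu_ml * OUTGROWTH_ML / fmol_added


def recomb_per_recip(recipient_cfu_ml, recombinant_cfu_ml):
    """Fraction of recovered target plasmids that carry an insertion."""
    return recombinant_cfu_ml / recipient_cfu_ml


for name, (conc, recip, recomb) in samples.items():
    print(
        f"{name}: {colonies_per_fmol(conc, recip):.1e} colonies/fmol, "
        f"{recomb_per_recip(recip, recomb):.1e} recomb/recip"
    )
```

Under these assumptions the computed values agree with the tabulated ones to within rounding.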

[0466] 75 recombinant colonies were chosen for further characterization: 31 from sample 2A and 44 from sample 2B.

[0467] H. Determination of Sequence Location.

[0468] 1. Procedure Summary

[0469] 75 recombinant colonies were picked into 0.5 ml RB in racked array for storage. Subcultures of these storage cultures were grown with selection (RB Cam Kan), and minipreps made according to the directions of the manufacturer for large plasmids of low copy number.

[0470] DNA concentration of the plasmid preps was determined by comparison with a dilution series of linearized pLITMUS28 on agarose gels. Plasmid preps were linearized for this purpose with an enzyme that cleaves once in the target plasmid and not in the transposon (SacII).

[0471] Primers NLC94 (SEQ ID NO:13) and NLC95 (SEQ ID NO:7) were used for sequence determination, using fluorescently labeled dideoxynucleotide sequencing reagents from Applied Biosystems.

[0472] Sequences were run on an ABI sequencer, and sequence acquisition, editing and assembly were carried out with the supplied programs (SEQED, FACTURA and AUTOASSEMBLE).

[0473] Output is FIG. 11

[0474] 2. Results

[0475] a. Table 4: Summary result of 75 recombinants (CamR KanR colonies), potential Tn7 insertions into pER183.
Total DNA preps: 75
  DNA concentration too low to attempt sequence: 7
  Transformant contained two plasmids, not sequenced: 1
  Total not sequenced: 8
DNA preps sequence attempted: 67
  Sequence unreadable (miscellaneous reasons): 2
  Sequence unreadable because 2 insertions in one plasmid: 1
  Total sequence unreadable: 3
DNA preps sequence obtained: 64
  Sequence rejected (cross contamination of adjacent wells): 1
  Total insertions rejected: 1
Independent insertions for which location was obtained: 63
  Number of insertion locations: 62
  Number inserted clockwise: 33
  Number inserted counterclockwise: 30
Aberrant insertions
  Number of insertion plasmids with structural aberrations: 1 (this was a deletion far from the insertion)
  Number of structural aberrations associated with insertion site: 0
  Number of insertions with disagreement in 5 bp duplication: 2

[0476] These were:

[0477] G→A transition mutation in one copy with respect to targetplasmid sequence

[0478] G→T transversion mutation in one copy with respect to targetplasmid sequence.

[0479] b. Analysis of the distribution of insertions among sequences and intervals. For the purpose of obtaining maximum sequence from an unknown target, it is desirable that the insertions be distributed as randomly as possible with respect to regions of sequence and with respect to specific sequences. The summary of Table 4 already suggests a very random process, since 63 independent insertions hit 62 different locations, i.e. no hotspots for insertion were identified. For comparison, a relaxed-specificity derivative of Tn10 (ATS2, examined with in vivo insertions into the lac operon) hit 23 sites with 50 insertions.

[0480] Primary data for further analysis below is found in Table 5, which gives the location of all the insertions, their orientation with respect to the target plasmid, and the sequence immediately adjacent to the insertion (the five bp sequence duplicated by the insertion mechanism) in a uniform frame of reference.

TABLE 5 Insertion locations and associated 5 bp duplication
Isolate   Directions sequenced   Location   Insert name   Sequence at position relative to Tn7R (1 2 3 4 5)   Orientation (Tn7R clockwise = +)
1    1   6464   A5    A G C T C   −
2    2   8428   A6    C T G G T   −
3    1   8349   A7    C C T G A   +
4    2   5515   A8    T A A C T   +
5    2   7822   A9    C C C G C   +
6    2   365    A10   T C A A C   +
7    2   5695   A11   T C A C G   −
9    2   2500   B1    G G A T G   +
10   1   8286   B2    C T T C C   +
11   1   2764   B3    C T T T A   +
12   2   6953   B4    C G A G G   +
13   2   3414   B5    C T T T G   +
14   1   3139   B6    T C G T T   −
15   1   3208   B7    G C A C T   −
16   2   4208   B8    A G A G C   −
17   2   3671   B9    G T T T A   +
18   2   5563   B10   C C A A C   −
19   1   3539   B11   G C T T C   +
20   2   3803   B12   A T T C C   −
21   2   8474   C1    C C G C C   +
22   2   5661   C2    A T G A T   +
23   2   7693   C3    C G C G T   −
24   2   3205   C4    T C T T C   −
25   2   1650   C5    C C T A T   −
26   2   8020   C6    G C C G G   −
27   2   2566   C7    A T T T T   +
29   2   2275   C9    G C C C A   +
30   1   6368   C10   G C T A T   −
32   2   2629   C12   T A T A C   +
33   2   5988   D1    G G C G A   +
34   2   3499   D2    A T G T A   −
35   2   3933   D3    T T G A T   −
36   2   6077   D4    G T T G T   +
37   2   6756   D5    T T G A G   −
38   2   5563   D6    G T T G G   +
39   2   8224   D7    G G A G G   −
40   2   3123   D8    C A A A T   −
41   1   2746   D9    A A A A C   −
42   1   1646   D10   C G A G A   +
43   1   5678   D11   A T G T G   +
44   2   7406   D12   T G C A T   +
45   2   1744   E1    G C C A T   −
46   2   3584   E2    T A G G T   +
47   2   2112   E3    C C T A C   +
48   2   4205   E4    G C A G C   −
49   1   2708   E5    G C G G T   +
50   2   7828   E6    A C A G A   +
52   2   3873   E8    A G T C T   −
53   2   3591   E9    C A T G C   −
56   2   5550   E12   A T C G C   −
57   2   2702   F1    T T C A C   +
61   2   4490   F5    G T T A A   −
62   2   5811   F6    A C G C G   +
63   2   2024   F7    A C T G T   −
64   2   1479   F8    A T C G T   −
66   2   5675   F10   T T T A T   +
67   2   5208   F11   A T A A A   +
68   2   6020   F12   G G T A A   +
69   2   6264   G1    G A G T A   +
70   2   3881   G2    A T T T G   −
72   2   2891   G4    A T T C G   −
74   2   1681   G6    A C T C T   −
76   2   5315   G7    A A T A C   +

[0481] Table 5 Legend:

[0482] Isolate: Number of the colony

[0483] Directions sequenced: 1=only one direction from the insertion;2=both directions

[0484] Position: coordinate on pER183 (SEQ ID NO:5) top strand of the first base of the 5 bp duplication

[0485] Insert name: accession number in notebook

[0486] Sequence at position #: position 1 is the base immediately adjacent to Tn7R top strand (i.e. it can be either the top or the bottom strand of pER183 (SEQ ID NO:5)); position #2 is the next but one to Tn7R; and so forth.

[0487] Orientation: of the insertion relative to the top strand of pER183 (SEQ ID NO:5). +, Tn7R is to the right of Tn7L when displayed on the top strand of pER183 (SEQ ID NO:5). −, Tn7R is to the left of Tn7L.

[0488] i. Distribution of insertions fits the Poisson distribution

[0489] a. These insertions are randomly distributed as judged by the fit of the interval distribution to the distribution predicted by a Poisson process.

[0490] The Poisson distribution gives the probability of observing exactly X_(i) events (insertions) in a unit (interval) when the average number of events per unit is μ (from Zar, J. H. Biostatistical Analysis Prentice-Hall, Englewood Cliffs, N.J. 1974 p. 301).

$P(X_{i}) = \frac{\mu^{X_{i}}\, e^{-\mu}}{X_{i}!} \qquad \text{(eq 1)}$

[0491] Where

[0492] X_(i)=exactly X_(i) insertions per interval

[0493] μ=average number of insertions per interval

[0494] Let

[0495] X_(i)=number of insertions in a 100 bp interval

[0496] f_(i)=observed number of 100 bp intervals with X_(i) insertions/interval

[0497] n=number of 100 bp intervals in the set (=73)

[0498] μ=Σf_(i)X_(i)/Σf_(i)=63/73

[0499] P_((xi))=probability of finding X_(i) insertions in a 100 bp interval (from the Poisson distribution, eq 1)

[0500] F_(i)=P_((xi))n=expected number of intervals with X_(i) insertions.

[0501] From the data in Table 5 and eq 1 we can construct the following comparison of expected and observed data:

TABLE 6 Observed and expected distribution of insertions in 100 bp intervals
Insertions per interval   Observed intervals with     Probability of X_(i)       Expected number of intervals
X_(i)                     X_(i) insertions, f_(i)     insertions per interval,   with X_(i) insertions, F_(i)
                                                      P(X_(i))
0                         34                          0.42189                    30.80
1                         24                          0.36410                    26.58
2                         9                           0.15711                    11.47
3                         3                           0.04520                    3.299
4                         3                           0.00975                    0.712
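The probabilities and expected counts of Table 6 follow directly from eq 1 and the definitions above. A minimal sketch, using only the observed counts f_(i) as input (the function name is chosen here for illustration):

```python
import math

# Observed number of 100 bp intervals (f_i) containing X_i insertions.
observed = {0: 34, 1: 24, 2: 9, 3: 3, 4: 3}

n = sum(observed.values())                          # number of intervals
mu = sum(x * f for x, f in observed.items()) / n    # mean insertions per interval


def poisson_pmf(x, mean):
    """Probability of exactly x events for a Poisson process (eq 1)."""
    return mean ** x * math.exp(-mean) / math.factorial(x)


# Expected number of intervals with X_i insertions: F_i = P(X_i) * n.
expected = {x: n * poisson_pmf(x, mu) for x in observed}

for x in observed:
    print(f"X={x}: f={observed[x]:2d}  P={poisson_pmf(x, mu):.5f}  F={expected[x]:.3f}")
```

Here n evaluates to 73 and μ to 63/73, matching the values stated above.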

[0502] These distributions are illustrated in FIG. 12, where f_(i)=observed distribution, F_(i)=expected distribution. The fit looks good to the eye.

[0503] b. Statistical test of fit between observed and expecteddistributions

[0504] To test whether the observed and expected distributions are statistically indistinguishable, we used a Chi-square test for goodness of fit (from Zar, J. H. Biostatistical Analysis Prentice-Hall, Englewood Cliffs, N.J. 1974 p. 303). For this purpose we pool the tail of the distribution so that no expected number is less than 4. Rewriting Table 6, we obtain

TABLE 7 Chi-square test of goodness of fit to a random distribution
Insertions per interval   Observed intervals with     Expected number of intervals    Chi-square
X_(i)                     X_(i) insertions, f_(i)     with X_(i) insertions, F_(i)    (f_(i) − F_(i))²/F_(i)
0                         34                          30.80                           0.3329
1                         24                          26.58                           0.2504
2                         9                           11.47                           0.5315
≧3                        6                           4.1541                          0.8209
Sum                                                                                   1.944

[0505] The null hypothesis is that the observed distribution was drawn from a Poisson distributed population. For two degrees of freedom this sum of chi-square values gives a probability that this is the case of 0.25<p<0.5. The null hypothesis is not rejected.

[0506] In sum, the eye (part a, FIG. 12) and a statistical test (Table 7 and following) agree that the distribution of insertions in intervals along the DNA is random.
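The goodness-of-fit calculation can be sketched as follows. This is an illustration, not the original computation: it pools the tail as described (classes 0, 1, 2 and ≧3), takes the degrees of freedom as the number of classes minus two, and uses the fact that for two degrees of freedom the chi-square tail probability is exp(−χ²/2). Because it works from unrounded expected values, the sum may differ slightly in the third digit from the tabulated one.

```python
import math

n = 73                      # number of 100 bp intervals
mu = 63 / 73                # mean insertions per interval
observed = [34, 24, 9, 6]   # intervals with 0, 1, 2 and >=3 insertions

# Expected counts for 0, 1 and 2 insertions from the Poisson distribution;
# the pooled tail (>=3) takes whatever remains of the n intervals.
expected = [n * mu ** x * math.exp(-mu) / math.factorial(x) for x in range(3)]
expected.append(n - sum(expected))

chi_square = sum((f - F) ** 2 / F for f, F in zip(observed, expected))
degrees_of_freedom = len(observed) - 2

# Tail probability of the chi-square distribution; this closed form
# holds only for two degrees of freedom.
p_value = math.exp(-chi_square / 2)

print(f"chi-square = {chi_square:.3f}, df = {degrees_of_freedom}, p = {p_value:.2f}")
```

The resulting probability falls inside the 0.25<p<0.5 bracket quoted above, so the sketch reaches the same conclusion: the null hypothesis is not rejected.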

[0507] ii. Analysis of the base composition of insertion sites.

[0508] Site Preference of TnsABC₁₂₇ for Insertion of miniTn7 into pER183

[0509] Certain bases are preferred at some positions in the five-base insertion site duplication, as shown in a histogram of base incidence versus position in the site (FIG. 13), taken from the data in Table 5. In collating the data for this histogram, the five duplicated bases were assigned position numbers relative to Tn7R; position one is the base immediately adjacent to Tn7R when the sequence is displayed with Tn7R on the right and Tn7L on the left. The orientation of the transposon relative to the target sequence during target choice is thus controlled for: the target is displayed in the same way relative to the transposon for all insertion sites.

[0510] A model for a preferred site was formulated: NYTRN. The elements of this site were tested for statistical significance individually and collectively by chi-square analysis (Table 8). The null hypothesis was that sites were drawn randomly from the universe of sequence defined by the sequence of pER183 (SEQ ID NO:5) after deleting sequence subject to selection (bp 1-250 and 2481-2509, CamR; and 581-1400, replication origin). Expected frequencies of the four bases, of purines and pyrimidines, and of trinucleotides were derived from frequencies obtained for pER183 (SEQ ID NO:5)-condensed by the GCG program COMPOSITION.

TABLE 8 Chi-square tests (tests that differ from random expectation, p<0.05, are marked *)

Four bases individually, all sites collectively (315 bp experimental, 7410 bp control)
Base    Expected   Observed   Chi-square
A       78.4       73         .372
C       74.3       77         .981
G       76.2       72         .232
T       85.7       93         .622
Sum                           2.21      0.5<p<0.75

Four bases individually, each position individually (63 bp experimental, 7410 bp control)
Position 1
A       15.7       19         0.694
C       14.9       15         0.00066
G       15.2       17         0.213
T       17.1       12         1.52
Sum                           2.43      0.25<p<0.5
Position 2
A       15.7       8          3.8
C       14.9       22         3.38
G       15.2       11         1.16
T       17.1       22         1.4
Sum                           9.74      0.01<p<0.025 *
Position 3
A       15.7       15         0.031
C       14.9       11         1.02
G       15.2       12         0.674
T       17.1       25         3.65
Sum                           5.37      0.1<p<0.25
Position 4
A       15.7       19         0.693
C       14.9       11         1.02
G       15.2       20         1.52
T       17.1       13         0.983
Sum                           4.21      0.1<p<0.25
Position 5
A       15.7       12         0.872
C       14.9       18         0.645
G       15.2       12         0.674
T       17.1       21         0.889
Sum                           3.08      0.25<p<0.5

Purines and Pyrimidines, each position individually (63 bp experimental, 7410 bp control)
Position 1
R       30.9       36         0.842
Y       32.1       27         0.810
Sum                           1.65      0.1<p<0.25
Position 2
R       30.9       19         4.58
Y       32.1       44         4.41
Sum                           8.99      0.001<p<0.005 *
Position 3
R       30.9       27         0.49
Y       32.1       36         0.422
Sum                           0.914     0.25<p<0.5
Position 4
R       30.9       39         2.12
Y       32.1       24         2.04
Sum                           4.17      0.025<p<0.05 *
Position 5
R       30.9       24         1.54
Y       32.1       39         1.48
Sum                           3.10      0.05<p<0.1

T or not-T, position 3
T       17.16      25         3.58
not-T   45.84      38         1.34
Sum                           4.92      0.025<p<0.05 *

Triplets, positions 234 (63 experimental triplets, 7408 control triplets to determine expectation)
Triplet   Expected   Observed   Chi-square
All triplets
YNR       15.96      25         7.54
RNY       15.97      5          5.12
RNR       14.98      14         0.064
YNY       16.07      19         0.534
Sum                             13.25     0.001<p<0.005 *
Specific triplets, positions 234
YNR       16         25         5.06
Not YNR   47         38         1.7
Sum                             6.78      0.005<p<0.01 *
RNY       16         5          7.56
Not RNY   47         59         3.06
Sum                             10.62     0.001<p<0.005 *
YTR       3.93       10         9.38
not YTR   59.07      53         0.623
Sum                             10.0      0.001<p<0.005 *

Pairing between position 2 and 4 (GNC, CNG, ANT, TNA)
Paired       16.95    16        0.053
Not paired   46.05    47        0.0196
Sum                             0.073     0.75<p<0.9

[0511] Preference for this site was statistically significant (p<0.005), and preference for each of its parts was also significant (p<0.05). However, the preference is not particularly strong, in that representation of the site was only 2.5-fold more frequent in insertion sites than expected from the composition of the plasmid; and 53 out of 63 sites do not fit the consensus. Each preferred position contributes independently to the overall preference, since multiplying together the overrepresentation of each position yields the overrepresentation of the site as a whole (Table 9).

TABLE 9 Overrepresentation of preferred bases in Tn7 insertion sites
Position   Preference   Expected   Observed   Fold overrepresentation (Obs/Exp)
2          Y            32.1       44         1.37
3          T            17.6       25         1.42
4          R            30.9       39         1.26
Product ((O/E)2 × (O/E)3 × (O/E)4)            2.46
Triplet    YTR          3.93       10         2.54
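The independence argument amounts to comparing the product of the per-position observed/expected ratios with the ratio for the triplet as a whole. A short sketch using the expected and observed counts as tabulated in Table 9 (variable names are illustrative):

```python
# (expected, observed) counts for each preferred position, as tabulated.
positions = {
    "position 2 = Y": (32.1, 44),
    "position 3 = T": (17.6, 25),
    "position 4 = R": (30.9, 39),
}
triplet_expected, triplet_observed = 3.93, 10   # YTR at positions 2-4

fold = {name: obs / exp for name, (exp, obs) in positions.items()}

# If the positions contribute independently, the product of the individual
# ratios should approximate the ratio observed for the whole triplet.
product = 1.0
for ratio in fold.values():
    product *= ratio
triplet_fold = triplet_observed / triplet_expected

for name, ratio in fold.items():
    print(f"{name}: {ratio:.2f}-fold")
print(f"product of positions: {product:.2f}; triplet YTR: {triplet_fold:.2f}")
```

The two figures agree to within a few percent, which is the basis for treating the three positions as independent contributors.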

[0512] We conclude that insertion mediated by TnsABC₁₂₇ is extremely random, with only a slight preference for sites of the form NYTRN (SEQ ID NO:15).

[0513] I. Recipes.

1. 100 X DNA buffer, per liter
Tris Base: 121.1 g. Dissolve in 700 ml.
4 M HCl: ˜90 ml. Bring pH to 7.4.
Na₂EDTA: 37.2 g
NaCl: 29.22 g
Make up to ˜950 ml, adjust pH, make up to 1 L. Aliquot, autoclave.

2. 1 X DNA buffer
100 X DNA buffer: 1 ml
dH₂O, sterile: 100 ml

3. 10 X Fi/L (Fill-in, ligation) buffer
10 X ligase buffer: 1500 μl (New England BioLabs, 32 Tozer Road, Beverly, Massachusetts, 01915)
100 mM dATP: 3.75 μl (New England BioLabs)
100 mM dCTP: 3.75 μl (New England BioLabs)
100 mM dGTP: 3.75 μl (New England BioLabs)
100 mM dTTP: 3.75 μl (New England BioLabs)

4. TE
1 M Tris pH 8.0: 1 ml
0.5 M EDTA pH 8.0: 0.2 ml
dH₂O to 100 ml

[0514] Filter sterilise

EXAMPLE 5

[0515] A Convenient Method for Stopping a Transposon Insertion Reaction

[0516] In order to use DNA molecules with transposon insertions, they must be recovered in vivo. It is most convenient to be able to do this without the labor and losses associated with extraction with organic solvents and alcohol precipitation. Prior art has suggested, however, that transposition reaction products formed during in vitro insertion experiments are DNA:protein complexes that are extremely stable; evidence suggests that a chaperone-like activity is required for disruption of these products. Accordingly, organic extraction was deemed necessary for satisfactory disruption of the complexes.

[0517] This example demonstrates that heat inactivation at 75° C. is adequate for disrupting these complexes, or at least for putting them into a form that can be introduced into the cell by chemical transformation.

[0518] Section A. MATERIALS

[0519] A1) PROTEINS

[0520] TnsA 30 μg/ml in 10% glycerol

[0521] TnsB 20 μg/ml in 25% glycerol

[0522] TnsC₁₂₇ 100 μg/ml in 10% glycerol

[0523] Keep stored at −70° C. Sufficient protein for 10 reactions is provided. At the time of use, keep frozen on dry ice until ready to add to the reaction, and keep on dry ice until returned to the freezer.

A2) BUFFER CONSTITUENTS
HEPES          0.25 M, pH 8.1
Tris [Cl]      0.25 M, pH 7.6
BSA            10 mg/ml
tRNA           50 μg/ml
DTT            1 M
ATP            100 mM
Mg Acetate     375 mM

A3) TRANSPOSON DONOR PLASMID
100 μg/ml. This is as described for Example 3.

A4) TARGET PLASMID
pLITMUS28      400 μg/ml

A5) OTHER
Milli-Q water
Heat block, 30° C.
1.5 ml microtubes

[0524] A6) FOR STOPPING THE REACTION

[0525] when using chemically competent cells

[0526] Water bath or heat block, 75° C.

[0527] Water bath or heat block, 65° C.

[0528] Distilled phenol equilibrated with TE

[0529] Chloroform equilibrated with TE

[0530] EtOH for precipitation

[0531] NaCl 3 M

[0532] Water or 1 mM Tris pH 8

[0533] A7) CHEMICALLY COMPETENT TRANSFORMABLE CELLS:

[0534] In this example, we show the use of

[0535] Chemically competent ER1821 (2×10⁷ transformants/μg of LITMUS)

[0536] Chemically competent ER2502 (6×10⁶ transformants/μg of LITMUS)

[0537] prepared as in Example 3

[0538] A8) MEDIA

[0539] Rich Broth and Rich Agar (Kan, Amp) prepared as in Example 3.

[0540] Section B. Tn7 in vitro Transposition Reaction Protocol

[0541] 1. Experiment 1. Four Stop Treatments

[0542] Reactions were carried out as in Example 3, using quadruplicate samples for each of four treatments. At step 12, one of these treatments was substituted. For transformation, ER2502 was used.

[0543] Treatment 1: No treatment.

[0544] Treatment 2: Heat treatment at 65° C. for 20 min

[0545] Treatment 3: Heat treatment at 75° C. for 10 min.

[0546] Treatment 4: Phenol extraction once, chloroform extraction once, ethanol precipitation once, resuspension in original volume of TE.

[0547] The results of this experiment are given in Table 10 below and illustrated in FIG. 14.

TABLE 10 Transformants obtained per 1/50th volume of transposition reaction

                  Replicate
Treatment         #1    #2    #3    #4
nothing           1     0     0     0
65 C 20′          1     0     0     0
75 C 10′          32    24    22    10
phenol + pptn     30    23    20    17

[0548] 2. Experiment 2. Three Stop Treatments

[0549] Reactions were carried out as in Example 3, using duplicate samples for each of three stop treatments, for two aliquots of TnsB, and for three volumes of TnsB. At step 12, one of the stop treatments was substituted. For transformation, ER1821 was used.

[0550] Treatment 1: Heat treatment at 75° C. for 10 min.

[0551] Treatment 2: Ethanol precipitation only, resuspension in original volume of TE

[0552] Treatment 3: Heat treatment at 65° C. for 20 min

[0553] The results of this experiment are given in Table 9 below and illustrated in FIG. 15.

TABLE 9 Transformants obtained per 1/50th volume of transposition reaction, three stop treatments.

TnsB       Volume    75C 10 min       EtOH pptd       65C 20 min
Aliquot    (μl)      #1      #2       #1     #2       #1     #2
1          1         158     8        13     3        15     10
1          1.5       186     0        16     0        3      0
1          2         178     170      13     13       30     16
1          3         454     366      47     21       11     8
2          1         324     140      21     3        9      2
2          1.5       506     462      58     44       25     25
2          2         1220    1102     88     37       54     18
2          3         1802    1690     129    126      37     14

[0554] These two experiments demonstrate that heat treatment at 75° C. for 10 min is an adequate method of stopping the transposition reaction and gives as many transformants as treatment with phenol, chloroform and ethanol precipitation; whereas no treatment, ethanol precipitation alone, and heat treatment at 65° C. for 20 min are inadequate, giving no transformants or a greatly reduced number of transformants.
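The comparison drawn from Experiment 1 can be summarized by averaging the four replicates for each stop treatment. The following is a minimal sketch using the transformant counts reported for Experiment 1; the dictionary keys are illustrative labels only.

```python
from statistics import mean

# Transformants per 1/50th of a reaction, four replicates each (Experiment 1).
counts = {
    "no treatment": [1, 0, 0, 0],
    "65 C, 20 min": [1, 0, 0, 0],
    "75 C, 10 min": [32, 24, 22, 10],
    "phenol + pptn": [30, 23, 20, 17],
}

# Average number of transformants for each stop treatment.
means = {treatment: mean(values) for treatment, values in counts.items()}

for treatment, value in means.items():
    print(f"{treatment}: {value}")
```

On these counts the 75° C. treatment and the phenol, chloroform and ethanol treatment give nearly the same average, while the other two treatments average well under one transformant per plate.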

EXAMPLE 6

[0555] Storing Three Components of Tn7 Transposase Together

[0556] Convenient routine use of in vitro transposition as a method in molecular biology would be facilitated if the protein components of the reaction could be stored in a single tube. In this way, variability in volume measurement from one experiment to another would be minimized, time and labor would be saved, and reproducibility enhanced. The TnsABC₁₂₇ transposition reaction described in the foregoing examples involves the addition of three different protein components.

[0557] This example demonstrates that these three protein components of the reaction can be mixed and stored together without interfering with the efficiency of the transposition reaction.

[0558] Section A. MATERIALS

[0559] A1) INDIVIDUAL PROTEINS

[0560] TnsA 30 μg/ml in 10% glycerol

[0561] TnsB 20 μg/ml in 50% glycerol

[0562] TnsC₁₂₇ 100 μg/ml in 50% glycerol

[0563] Keep stored at −70° C.

[0564] A2) MIXED PROTEINS, COMPRISING

[0565] TnsA 7.36 μg/ml

[0566] TnsB 11.3 μg/ml

[0567] TnsC₁₂₇ 18.9 μg/ml

[0568] in 40% glycerol

[0569] A2a) Keep stored at −70° C., or

[0570] A2b) Keep stored at −20° C.

[0571] A3) OTHER COMPONENTS

[0572] These are as in example 1, parts A and B; including chemically competent ER2502 (6×10⁶ transformants/μg of LITMUS) prepared as in example 1.

[0573] Section B. Tn7 IN VITRO TRANSPOSITION REACTION PROTOCOL

[0574] B1. Reaction volume=100 μl

[0575] B2. Experimental variations (2 experiments are shown; reactions were carried out in quadruplicate).

[0576] Tube 1 Proteins of A1 added individually at step 6 below in a total volume of 5.3 μl

[0577] Tube 2 Mixture of A2a added together at step 6 below in a total volume of 5.3 μl

[0578] Tube 3 Mixture of A2b added together at step 6 below in a total volume of 5.3 μl (Experiment 2 only)

[0579] B3. Make up a mix as in Example 1, section C

[0580] B4. Dispense mix of step 3 as in Example 1, section C

[0581] B5. Add target DNA as in Example 1, section C. In this example, this is pLITMUS28, 1 μl

[0582] B6. Add to each tube

             Tube 1            Tube 2 and Tube 3 (each)
TnsA         1.3 μl (40 ng)
TnsB         3 μl (20 ng)
TnsC₁₂₇      1 μl (100 ng)
TnsABC₁₂₇    0                 5.3 μl (39 ng A, 59.9 ng B, 100.2 ng C₁₂₇)
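The protein masses quoted for the stored mixture follow from the concentrations listed under A2 and the 5.3 μl volume added at step B6. The short sketch below shows the arithmetic; the function name `mass_ng` and the dictionary keys are illustrative only.

```python
# Concentrations of the stored mixture (A2), in micrograms per milliliter.
mixture_ug_per_ml = {"TnsA": 7.36, "TnsB": 11.3, "TnsC127": 18.9}
volume_ul = 5.3  # volume of mixture added per reaction

def mass_ng(conc_ug_per_ml, vol_ul):
    # 1 ug/ml equals 1 ng/ul, so mass in ng is concentration times volume.
    return conc_ug_per_ml * vol_ul

masses = {name: round(mass_ng(conc, volume_ul), 1)
          for name, conc in mixture_ug_per_ml.items()}
print(masses)
```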

[0583] B7. Add 1 μl donor DNA (0.1 μg pMCB40) as in example 1C

[0584] B8. Incubate 10 minutes at 30° C. (assembly reaction) as in example 1C

[0585] B9. Add 4 μl MgAc (375 mM) to each tube as in example 1C

[0586] B10. Incubate 1 hour at 30° C. (transposition reaction) as in example 1C

[0587] B11. Heat Inactivate 75° C. 10 minutes

[0588] B12. Transform using chemically competent cells, as in example 1.

[0589] In this example, selective medium was RB Kan (20 μg/ml) Amp (100 μg/ml). Competent cells were ER2502, chemically competent (Example 1, Section D1).

[0590] C. Transformation result:

[0591] Experiment 1: Proteins were stored individually at −70° C. or as a mixture at −70° C. (A2a). In this experiment, the proteins in both treatments had suffered the same number of freeze-thaw cycles. 10 μl of each 100 μl reaction was transformed, and 100 μl of the 500 μl outgrowth culture was plated.

TABLE 10 Transformants obtained per 1/50th volume of transposition reaction, transposition proteins added as a mixture or individually. Result is displayed in Figure 16.

                  Replicate
Storage           #1    #2    #3    #4    Average    Avg per reaction
Individually      27    62    59    41    47         2350
As a mixture      47    68    60    23    49         2450
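The "per reaction" figures are obtained by scaling the plate counts by the fraction of the reaction that was actually plated: 10 μl of a 100 μl reaction was transformed, and 100 μl of a 500 μl outgrowth was plated, that is, 1/50th of the reaction. The sketch below reproduces this scaling from the whole-number averages reported for Experiment 1; the function names are hypothetical.

```python
from fractions import Fraction

def plated_fraction(transformed_ul=10, reaction_ul=100,
                    plated_ul=100, outgrowth_ul=500):
    """Fraction of the whole reaction represented on one plate."""
    return Fraction(transformed_ul, reaction_ul) * Fraction(plated_ul, outgrowth_ul)

def per_reaction(avg_colonies, fraction):
    """Scale an average plate count up to the whole reaction."""
    return avg_colonies / fraction

fraction = plated_fraction()
individually = per_reaction(47, fraction)  # proteins stored individually
mixture = per_reaction(49, fraction)       # proteins stored as a mixture
print(fraction, individually, mixture)
```

Exact fractions are used here so that the 1/50 factor is not blurred by floating-point rounding.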

[0592] Experiment 2: Proteins were stored individually at −70° C., as a mixture at −70° C. (A2a material) or as a mixture at −20° C. (A2b material). In this experiment, the proteins stored individually had suffered more freeze-thaw cycles than those stored together. 10 μl of each 100 μl reaction was transformed, and 100 μl of the 500 μl outgrowth culture was plated.

TABLE 11 Transformants obtained per 1/50th volume of transposition reaction, transposition proteins added as a mixture or individually following storage at −20° C. or −70° C. Result is displayed in Figure 17.

                                 Replicate
Storage                          #1     #2     #3     #4     Average    Avg per reaction
Individually                     13     38     17     22     22         1100
As a mixture, −70° C. (A2a)      167    173    117    218    168        8400
As a mixture, −20° C. (A2b)      179    125    219    199    180        9000

[0593] These two experiments demonstrate that the three Tns proteins can be stored together. The difference in experiment 2 between individual storage and storage together may be attributed to the number of freeze-thaw cycles.

BIBLIOGRAPHY

[0594] 1. Craig N L. 1997. In Ann Rev Biochem. ed. pp. 437-74. Palo Alto: Annual Reviews Inc.

[0595] 2. Kleckner N. 1989. In Mobile DNA. ed. Berg D and Howe M, pp. 227-68. Washington, D.C.: American Society for Microbiology.

[0596] 3. van Luenen HGAM, et al. 1994. Nucleic Acids Res. 22:262-9

4. Rosenzweig B, et al. 1983. Nucleic Acids Res. 11:4201-10

[0597] 5. Mori I, et al. 1988. Proc Natl Acad Sci USA. 85:861-4

[0598] 6. Eide D, et al. 1988. Mol Cell Biol. 8:737-46

[0599] 7. Mizuuchi M, et al. 1993. Cold Spring Harbor Symp. Quant. Biol. 58:515-23

[0600] 8. Berg D E. 1989. In Mobile DNA. ed. Berg D and Howe M, pp. 185-210. Washington, D.C.: American Society for Microbiology.

[0601] 9. Kirchner J, et al. 1995. Science. 267:1443-4

[0602] 10. Rogers M, et al. 1986. Mol. Gen. Genet. 205:550-6

[0603] 11. Waddell C S, et al. 1988. Genes Dev. 2:137-49

[0604] 12. Sarnovsky R, et al. 1996. EMBO J. 15:6348-61

[0605] 13. Kubo K M, et al. 1990. J. Bacteriol. 172:2774-8

[0606] 14. Wolkow C A, et al. 1996. Genes Dev. 10:2145-57

[0607] 15. Hauer B, et al. 1984. Mol Gen Genet. 194:149-58

[0608] 16. Arciszewska L K, et al. 1989. J. Mol. Biol. 207:35-52

[0609] 17. Lee C-H, et al. 1983. Proc Natl Acad Sci USA. 80:6765-9

[0610] 18. Reyes I, et al. 1987. Plasmid. 18:183-92

[0611] 19. Adzuma K, et al. 1988. Cell. 53:257-66

[0612] 20. Sakai J, et al. 1995. EMBO J. 14:4374-83

[0613] 21. Kleckner N, et al. 1996. Curr Top Microbiol Immunol. 204:49-82

[0614] 22. Bainton R, et al. 1991. Cell. 65:805-16

[0615] 23. Bainton R J, et al. 1993. Cell. 72:931-43

[0616] 24. Gamas P, et al. 1992. Nucl. Acids Res. 20:2525-32

[0617] 25. Stellwagen A, et al. 1997. Genetics. 145:573-85

[0618] 26. Stellwagen A, et al. 1997. EMBO J. (in press):

[0619] 27. Sankar P, et al. 1993. J. Bacteriol. 175:5145-52

[0620] 28. Devine S E, et al. 1994. Nucleic Acids Res. 22:3765-72

[0621] 29. Pryciak P M, et al. 1992. Proc Natl Acad Sci USA. 89:9237-41

[0622] 30. Pryciak P M, et al. 1992. Cell. 69:769-80

[0623] 31. Pryciak P M, et al. 1992. Embo J. 11:291-303

[0624] 32. Singh I R, et al. 1997. Proc Natl Acad Sci USA. 94:1304-09

[0625] 33. Kholodii G, et al. 1995. Mol. Microbiol. 17:1189-200

[0626] 34. Radstrom P, et al. 1994. J. Bacteriol. 176:3257-68

[0627] 35. Reimmann C, et al. 1989. Mol Gen Genet. 215:416-24

[0628] 36. Rowland S-J, et al. 1990. Mol. Microbiol. 4:961-75

[0629] 37. Walker, et al., eds. 1983. Techniques in Molecular Biology. New York: MacMillan Publishing Company

[0630] 38. Kunkel. 1985. Proc Natl Acad Sci USA. 82:488-92

[0631] 39. Kunkel, et al. 1987. Methods Enzymol. 154:367-82

[0632] 40. Sambrook J, et al., eds. 1989. Molecular Cloning: A Laboratory Manual. Cold Spring Harbor, N.Y.: Cold Spring Harbor Press

[0633] 41. Dayhoff, et al., eds. 1978. Washington, D.C.: Natl. Biomed Res. Found.

[0634] 42. Miller J H. 1972. In Experiments in Molecular Genetics. ed.pp. Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory.

[0635] 43. Elespuru R K, et al. 1979. Environ Mutagen. 1:65-78

[0636] 44. Yarmolinsky M B, et al. 1983. Mol Gen Genet. 192:140-8

[0637] 45. Haniford D B, et al. 1989. Cell. 59:385-94

[0638] 46. McKown R L, et al. 1987. Proc Natl Acad Sci USA. 84:7807-11

[0639] 47. McKown R L, et al. 1988. J. Bacteriol. 170:352-8

[0640] 48. Johnson R C, et al. 1984. Genetics. 9-18

[0641] 49. Rose M D, et al., eds. 1990. Methods in Yeast Genetics: A Laboratory Course Manual. Cold Spring Harbor: Cold Spring Harbor Laboratory

[0642] 50. Hughes O. 1993. Host Components of Tn7 Transposition.

[0643] 51. Huisman O, et al. 1987. Genetics. 116:185-9

[0644] 52. DeBoy R, et al. 1996. J. Bacteriol. 178:6184-91

[0645] 53. Flores C, et al. 1990. Nucl. Acids Res. 18:901-11

[0646] 54. Walker J E, et al. 1984. Biochem J. 224:799-815

[0647] 55. Saraste M, et al. 1990. Trends Biochem Sci. 15:430-4

[0648] 56. Sancar A, et al. 1993. Science. 259:1415-20

[0649] 57. Chaconas G, et al. 1985. J Biol Chem. 260:2662-9

[0650] 58. Faelen M, et al. 1978. Nature. 271:580-2

[0651] 59. O'Day K J, et al. 1978. In Microbiology. ed. Schlessinger D, pp. 48-51. Washington, D.C.: American Society of Microbiology.

[0652] 60. Surette M G, et al. 1987. Cell. 49:254-62

[0653] 61. Craigie R, et al. 1987. Cell. 51:493-501

[0654] 62. Koonin E V. 1992. Nucleic Acids Res. 20:1997

[0655] 63. Gary P A, et al. 1996. J. Mol Biol. 257:301-16

[0656] 64. Gwinn M L, et al. 1997. J. Bacteriology. 179:7315-20

[0657] 65. Bender J, et al. 1992. EMBO J. 11:741-50

[0658] 66. Lichtenstein C, et al. 1982. Unique insertion site of Tn7 in E. coli chromosome. Nature. 297:601-3

1 15 1670 base pairs nucleic acid single linear DNA (genomic) CDS1..1668 1 ATG AGT GCT ACC CGG ATT CAA GCA GTT TAT CGT GAT ACG GGG GTAGAG 48 Met Ser Ala Thr Arg Ile Gln Ala Val Tyr Arg Asp Thr Gly Val Glu 15 10 15 GCT TAT CGT GAT AAT CCT TTT ATC GAG GCC TTA CCA CCA TTA CAA GAG96 Ala Tyr Arg Asp Asn Pro Phe Ile Glu Ala Leu Pro Pro Leu Gln Glu 20 2530 TCA GTG AAT AGT GCT GCA TCA CTG AAA TCC TCT TTA CAG CTT ACT TCC 144Ser Val Asn Ser Ala Ala Ser Leu Lys Ser Ser Leu Gln Leu Thr Ser 35 40 45TCT GAC TTG CAA AAG TCC CGT GTT ATC AGA GCT CAT ACC ATT TGT CGT 192 SerAsp Leu Gln Lys Ser Arg Val Ile Arg Ala His Thr Ile Cys Arg 50 55 60 ATTCCA GAT GAC TAT TTT CAG CCA TTA GGT ACG CAT TTG CTA CTA AGT 240 Ile ProAsp Asp Tyr Phe Gln Pro Leu Gly Thr His Leu Leu Leu Ser 65 70 75 80 GAGCGT ATT TCG GTC ATG ATT CGA GGT GGC TAC GTA GGC AGA AAT CCT 288 Glu ArgIle Ser Val Met Ile Arg Gly Gly Tyr Val Gly Arg Asn Pro 85 90 95 AAA ACAGGA GAT TTA CAA AAG CAT TTA CAA AAT GGT TAT GAG CGT GTT 336 Lys Thr GlyAsp Leu Gln Lys His Leu Gln Asn Gly Tyr Glu Arg Val 100 105 110 CAA ACGGGA GAG TTG GAG ACA TTT CGC TTT GAG GAG GCA CGA TCT ACG 384 Gln Thr GlyGlu Leu Glu Thr Phe Arg Phe Glu Glu Ala Arg Ser Thr 115 120 125 GCA CAAAGC TTA TTG TTA ATT GGT TGT TCT GGT AGT GGG AAG ACG ACC 432 Ala Gln SerLeu Leu Leu Ile Gly Cys Ser Gly Ser Gly Lys Thr Thr 130 135 140 TCT CTTCAT CGT ATT CTA GCC ACG TAT CCT CAG GTG ATT TAC CAT CGT 480 Ser Leu HisArg Ile Leu Ala Thr Tyr Pro Gln Val Ile Tyr His Arg 145 150 155 160 GAACTC AAT GTA GAG CAG GTG GTG TAT TTG AAA ATA GAC TGC TCG CAT 528 Glu LeuAsn Val Glu Gln Val Val Tyr Leu Lys Ile Asp Cys Ser His 165 170 175 AATGGT TCG CTA AAA GAA ATC TGC TTG AAT TTT TTC AGA GCG TTG GAT 576 Asn GlySer Leu Lys Glu Ile Cys Leu Asn Phe Phe Arg Ala Leu Asp 180 185 190 CGAGCC TTG GGC TCG AAC TAT GAG CGT CGT TAT GGC TTA AAA CGT CAT 624 Arg AlaLeu Gly Ser Asn Tyr Glu Arg Arg Tyr Gly Leu Lys Arg His 195 200 205 GGTATA GAA ACC ATG TTG GCT TTG ATG TCG CAA ATA GCC AAT GCA CAT 672 Gly IleGlu Thr Met Leu 
Ala Leu Met Ser Gln Ile Ala Asn Ala His 210 215 220 GCTTTA GGG TTG TTG GTT ATT GAT GAA ATT CAG CAT TTA AGC CGC TCT 720 Ala LeuGly Leu Leu Val Ile Asp Glu Ile Gln His Leu Ser Arg Ser 225 230 235 240CGT TCG GGT GGA TCT CAA GAG ATG CTG AAC TTT TTT GTG ACG ATG GTG 768 ArgSer Gly Gly Ser Gln Glu Met Leu Asn Phe Phe Val Thr Met Val 245 250 255AAT ATT ATT GGC GTA CCA GTG ATG TTG ATT GGT ACC CCT AAA GCA CGA 816 AsnIle Ile Gly Val Pro Val Met Leu Ile Gly Thr Pro Lys Ala Arg 260 265 270GAG ATT TTT GAG GCT GAT TTG CGG TCT GCA CGT AGA GGG GCA GGG TTT 864 GluIle Phe Glu Ala Asp Leu Arg Ser Ala Arg Arg Gly Ala Gly Phe 275 280 285GGA GCT ATA TTC TGG GAT CCT ATA CAA CAA ACG CAA CGT GGA AAG CCC 912 GlyAla Ile Phe Trp Asp Pro Ile Gln Gln Thr Gln Arg Gly Lys Pro 290 295 300AAT CAA GAG TGG ATC GCT TTT ACG GAT AAT CTT TGG CAA TTA CAG CTT 960 AsnGln Glu Trp Ile Ala Phe Thr Asp Asn Leu Trp Gln Leu Gln Leu 305 310 315320 TTA CAA CGC AAA GAT GCG CTG TTA TCG GAT GAG GTC CGT GAT GTG TGG 1008Leu Gln Arg Lys Asp Ala Leu Leu Ser Asp Glu Val Arg Asp Val Trp 325 330335 TAT GAG CTA AGC CAA GGA GTG ATG GAC ATT GTA GTA AAA CTT TTT GTA 1056Tyr Glu Leu Ser Gln Gly Val Met Asp Ile Val Val Lys Leu Phe Val 340 345350 CTC GCT CAG CTC CGT GCG CTA GCT TTA GGC AAT GAG CGT ATT ACC GCT 1104Leu Ala Gln Leu Arg Ala Leu Ala Leu Gly Asn Glu Arg Ile Thr Ala 355 360365 GGT TTA TTG CGG CAA GTG TAT CAA GAT GAG TTA AAG CCT GTG CAC CCC 1152Gly Leu Leu Arg Gln Val Tyr Gln Asp Glu Leu Lys Pro Val His Pro 370 375380 ATG CTA GAG GCA TTA CGC TCG GGT ATC CCA GAA CGC ATT GCT CGT TAT 1200Met Leu Glu Ala Leu Arg Ser Gly Ile Pro Glu Arg Ile Ala Arg Tyr 385 390395 400 TCT GAT CTA GTC GTT CCC GAG ATT GAT AAA CGG TTA ATC CAA CTT CAG1248 Ser Asp Leu Val Val Pro Glu Ile Asp Lys Arg Leu Ile Gln Leu Gln 405410 415 CTA GAT ATC GCA GCG ATA CAA GAA CAA ACA CCA GAA GAA AAA GCC CTT1296 Leu Asp Ile Ala Ala Ile Gln Glu Gln Thr Pro Glu Glu Lys Ala Leu 420425 430 CAA GAG TTA GAT ACC GAA GAT CAG CGT CAT TTA TAT CTG ATG CTG AAA1344 Gln Glu Leu Asp 
Thr Glu Asp Gln Arg His Leu Tyr Leu Met Leu Lys 435440 445 GAG GAT TAC GAT TCA AGC CTG TTA ATT CCC ACT ATT AAA AAA GCG TTT1392 Glu Asp Tyr Asp Ser Ser Leu Leu Ile Pro Thr Ile Lys Lys Ala Phe 450455 460 AGC CAG AAT CCA ACG ATG ACA AGA CAA AAG TTA CTG CCT CTT GTT TTG1440 Ser Gln Asn Pro Thr Met Thr Arg Gln Lys Leu Leu Pro Leu Val Leu 465470 475 480 CAG TGG TTG ATG GAA GGC GAA ACG GTA GTG TCA GAA CTA GAA AAGCCC 1488 Gln Trp Leu Met Glu Gly Glu Thr Val Val Ser Glu Leu Glu Lys Pro485 490 495 TCC AAG AGT AAA AAG GTT TCG GCT ATA AAG GTA GTC AAG CCC AGCGAC 1536 Ser Lys Ser Lys Lys Val Ser Ala Ile Lys Val Val Lys Pro Ser Asp500 505 510 TGG GAT AGC TTG CCT GAT ACG GAT TTA CGT TAT ATC TAT TCA CAACGC 1584 Trp Asp Ser Leu Pro Asp Thr Asp Leu Arg Tyr Ile Tyr Ser Gln Arg515 520 525 CAA CCT GAA AAA ACC ATG CAT GAA CGG TTA AAA GGG AAA GGG GTAATA 1632 Gln Pro Glu Lys Thr Met His Glu Arg Leu Lys Gly Lys Gly Val Ile530 535 540 GTG GAT ATG GCG AGC TTA TTT AAA CAA GCA GGT TAG CC 1670 ValAsp Met Ala Ser Leu Phe Lys Gln Ala Gly * 545 550 555 555 amino acidsamino acid linear protein 2 Met Ser Ala Thr Arg Ile Gln Ala Val Tyr ArgAsp Thr Gly Val Glu 1 5 10 15 Ala Tyr Arg Asp Asn Pro Phe Ile Glu AlaLeu Pro Pro Leu Gln Glu 20 25 30 Ser Val Asn Ser Ala Ala Ser Leu Lys SerSer Leu Gln Leu Thr Ser 35 40 45 Ser Asp Leu Gln Lys Ser Arg Val Ile ArgAla His Thr Ile Cys Arg 50 55 60 Ile Pro Asp Asp Tyr Phe Gln Pro Leu GlyThr His Leu Leu Leu Ser 65 70 75 80 Glu Arg Ile Ser Val Met Ile Arg GlyGly Tyr Val Gly Arg Asn Pro 85 90 95 Lys Thr Gly Asp Leu Gln Lys His LeuGln Asn Gly Tyr Glu Arg Val 100 105 110 Gln Thr Gly Glu Leu Glu Thr PheArg Phe Glu Glu Ala Arg Ser Thr 115 120 125 Ala Gln Ser Leu Leu Leu IleGly Cys Ser Gly Ser Gly Lys Thr Thr 130 135 140 Ser Leu His Arg Ile LeuAla Thr Tyr Pro Gln Val Ile Tyr His Arg 145 150 155 160 Glu Leu Asn ValGlu Gln Val Val Tyr Leu Lys Ile Asp Cys Ser His 165 170 175 Asn Gly SerLeu Lys Glu Ile Cys Leu Asn Phe Phe Arg Ala Leu Asp 180 185 190 Arg AlaLeu Gly Ser Asn Tyr Glu 
Arg Arg Tyr Gly Leu Lys Arg His 195 200 205 GlyIle Glu Thr Met Leu Ala Leu Met Ser Gln Ile Ala Asn Ala His 210 215 220Ala Leu Gly Leu Leu Val Ile Asp Glu Ile Gln His Leu Ser Arg Ser 225 230235 240 Arg Ser Gly Gly Ser Gln Glu Met Leu Asn Phe Phe Val Thr Met Val245 250 255 Asn Ile Ile Gly Val Pro Val Met Leu Ile Gly Thr Pro Lys AlaArg 260 265 270 Glu Ile Phe Glu Ala Asp Leu Arg Ser Ala Arg Arg Gly AlaGly Phe 275 280 285 Gly Ala Ile Phe Trp Asp Pro Ile Gln Gln Thr Gln ArgGly Lys Pro 290 295 300 Asn Gln Glu Trp Ile Ala Phe Thr Asp Asn Leu TrpGln Leu Gln Leu 305 310 315 320 Leu Gln Arg Lys Asp Ala Leu Leu Ser AspGlu Val Arg Asp Val Trp 325 330 335 Tyr Glu Leu Ser Gln Gly Val Met AspIle Val Val Lys Leu Phe Val 340 345 350 Leu Ala Gln Leu Arg Ala Leu AlaLeu Gly Asn Glu Arg Ile Thr Ala 355 360 365 Gly Leu Leu Arg Gln Val TyrGln Asp Glu Leu Lys Pro Val His Pro 370 375 380 Met Leu Glu Ala Leu ArgSer Gly Ile Pro Glu Arg Ile Ala Arg Tyr 385 390 395 400 Ser Asp Leu ValVal Pro Glu Ile Asp Lys Arg Leu Ile Gln Leu Gln 405 410 415 Leu Asp IleAla Ala Ile Gln Glu Gln Thr Pro Glu Glu Lys Ala Leu 420 425 430 Gln GluLeu Asp Thr Glu Asp Gln Arg His Leu Tyr Leu Met Leu Lys 435 440 445 GluAsp Tyr Asp Ser Ser Leu Leu Ile Pro Thr Ile Lys Lys Ala Phe 450 455 460Ser Gln Asn Pro Thr Met Thr Arg Gln Lys Leu Leu Pro Leu Val Leu 465 470475 480 Gln Trp Leu Met Glu Gly Glu Thr Val Val Ser Glu Leu Glu Lys Pro485 490 495 Ser Lys Ser Lys Lys Val Ser Ala Ile Lys Val Val Lys Pro SerAsp 500 505 510 Trp Asp Ser Leu Pro Asp Thr Asp Leu Arg Tyr Ile Tyr SerGln Arg 515 520 525 Gln Pro Glu Lys Thr Met His Glu Arg Leu Lys Gly LysGly Val Ile 530 535 540 Val Asp Met Ala Ser Leu Phe Lys Gln Ala Gly 545550 555 5926 base pairs nucleic acid single circular other nucleic acid/desc = “pEM delta R.adj to 1” 3 TTTAGAGCAA TTCGGTGTTA GTTTCAGCAAGCAAACATTA ACCATAGCTA ATGATTTATA 60 GCCATATTAA CCATTGGGGT ACCGAGCTCGAATTCCATGG TCTGTTTCCT GTGTGAAATT 120 GTTATCCGCT CACAATTCCA CACATTATACGAGCCGGATG ATTAATTGTC AACAGCTCAT 180 
TTCAGAATAT TTGCCAGAAC CGTTATGATGTCGGCGCAAA AAACATTATC CAGAACGGGA 240 GTGCGCCTTG AGCGACACGA ATTATGCAGTGATTTACGAC CTGCACAGCC ATACCACAGC 300 TTCCGATGGC TGCCTGACGC CAGAAGCATTGGTGCACCGT GCAGTCGATG ATAAGCTGTC 360 AAACCAGATC AATTCGCGCT AACTCACATTAATTGCGTTG CGCTCACTGC CCGCTTTCCA 420 GTCGGGAAAC CTGTCGTGCC AGCTGCATTAATGAATCGGC CAACGCGCGG GGAGAGGCGG 480 TTTGCGTATT GGGCGCCAGG GTGGTTTTTCTTTTCACCAG TGAGACGGGC AACAGCTGAT 540 TGCCCTTCAC CGCCTGGCCC TGAGAGAGTTGCAGCAAGCG GTCCACGCTG GTTTGCCCCA 600 GCAGGCGAAA ATCCTGTTTG ATGGTGGTTGACGGCGGGAT ATAACATGAG CTGTCTTCGG 660 TATCGTCGTA TCCCACTACC GAGATATCCGCACCAACGCG CAGCCCGGAC TCGGTAATGG 720 CGCGCATTGC GCCCAGCGCC ATCTGATCGTTGGCAACCAG CATCGCAGTG GGAACGATGC 780 CCTCATTCAG CATTTGCATG GTTTGTTGAAAACCGGACAT GGCACTCCAG TCGCCTTCCC 840 GTTCCGCTAT CGGCTGAATT TGATTGCGAGTGAGATATTT ATGCCAGCCA GCCAGACGCA 900 GACGCGCCGA GACAGAACTT AATGGGCCCGCTAACAGCGC GATTTGCTGG TGACCCAATG 960 CGACCAGATG CTCCACGCCC AGTCGCGTACCGTCTTCATG GGAGAAAATA ATACTGTTGA 1020 TGGGTGTCTG GTCAGAGACA TCAAGAAATAACGCCGGAAC ATTAGTGCAG GCAGCTTCCA 1080 CAGCAATGGC ATCCTGGTCA TCCAGCGGATAGTTAATGAT CAGCCCACTG ACGCGTTGCG 1140 CGAGAAGATT GTGCACCGCC GCTTTACAGGCTTCGACGCC GCTTCGTTCT ACCATCGACA 1200 CCACCACGCT GGCACCCAGT TGATCGGCGCGAGATTTAAT CGCCGCGACA ATTTGCGACG 1260 GCGCGTGCAG GGCCAGACTG GAGGTGGCAACGCCAATCAG CAACGACTGT TTGCCCGCCA 1320 GTTGTTGTGC CACGCGGTTG GGAATGTAATTCAGCTCCGC CATCGCCGCT TCCACTTTTT 1380 CCCGCGTTTT CGCAGAAACG TGGCTGGCCTGGTTCACCAC GCGGGAAACG GTCTGATAAG 1440 AGACACCGGC ATACTCTGCG ACATCGTATAACGTTACTGG TTTCACATTC ACCACCCTGA 1500 ATTGACTCTC TTCCGGGCGC TATCATGCCATACCGCGAAA GGTTTTGCAC CATTCGATGG 1560 TGTCAACGTA AATGCATGCC GCTTCGCCTTCGCGCGCGAA TTGATCTGCT GCCTCGCGCG 1620 TTTCGGTGAT GACGGTGAAA ACCTCTGACACATGCAGCTC CCGGAGACGG TCACAGCTTG 1680 TCTGTAAGCG GATGCCGGGA GCAGACAAGCCCGTCAGGGC GCGTCAGCGG GTGTTGGCGG 1740 GTGTCGGGGC GCAGCCATGA CCCAGTCACGTAGCGATAGC GGAGTGTATA CTGGCTTAAC 1800 TATGCGGCAT CAGAGCAGAT TGTACTGAGAGTGCACCATA TGCGGTGTGA AATACCGCAC 1860 AGATGCGTAA GGAGAAAATA CCGCATCAGGCGCTCTTCCG 
CTTCCTCGCT CACTGACTCG 1920 CTGCGCTCGG TCGTTCGGCT GCGGCGAGCGGTATCAGCTC ACTCAAAGGC GGTAATACGG 1980 TTATCCACAG AATCAGGGGA TAACGCAGGAAAGAACATGT GAGCAAAAGG CCAGCAAAAG 2040 GCCAGGAACC GTAAAAAGGC CGCGTTGCTGGCGTTTTTCC ATAGGCTCCG CCCCCCTGAC 2100 GAGCATCACA AAAATCGACG CTCAAGTCAGAGGTGGCGAA ACCCGACAGG ACTATAAAGA 2160 TACCAGGCGT TTCCCCCTGG AAGCTCCCTCGTGCGCTCTC CTGTTCCGAC CCTGCCGCTT 2220 ACCGGATACC TGTCCGCCTT TCTCCCTTCGGGAAGCGTGG CGCTTTCTCA TAGCTCACGC 2280 TGTAGGTATC TCAGTTCGGT GTAGGTCGTTCGCTCCAAGC TGGGCTGTGT GCACGAACCC 2340 CCCGTTCAGC CCGACCGCTG CGCCTTATCCGGTAACTATC GTCTTGAGTC CAACCCGGTA 2400 AGACACGACT TATCGCCACT GGCAGCAGCCACTGGTAACA GGATTAGCAG AGCGAGGTAT 2460 GTAGGCGGTG CTACAGAGTT CTTGAAGTGGTGGCCTAACT ACGGCTACAC TAGAAGGACA 2520 GTATTTGGTA TCTGCGCTCT GCTGAAGCCAGTTACCTTCG GAAAAAGAGT TGGTAGCTCT 2580 TGATCCGGCA AACAAACCAC CGCTGGTAGCGGTGGTTTTT TTGTTTGCAA GCAGCAGATT 2640 ACGCGCAGAA AAAAAGGATC TCAAGAAGATCCTTTGATCT TTTCTACGGG GTCTGACGCT 2700 CAGTGGAACG AAAACTCACG TTAAGGGATTTTGGTCATGA GATTATCAAA AAGGATCTTC 2760 ACCTAGATCC TTTTAAATTA AAAATGAAGTTTTAAATCAA TCTAAAGTAT ATATGAGTAA 2820 ACTTGGTCTG ACAGTTACCA ATGCTTAATCAGTGAGGCAC CTATCTCAGC GATCTGTCTA 2880 TTTCGTTCAT CCATAGTTGC CTGACTCCCCGTCGTGTAGA TAACTACGAT ACGGGAGGGC 2940 TTACCATCTG GCCCCAGTGC TGCAATGATACCGCGAGACC CACGCTCACC GGCTCCAGAT 3000 TTATCAGCAA TAAACCAGCC AGCCGGAAGGGCCGAGCGCA GAAGTGGTCC TGCAACTTTA 3060 TCCGCCTCCA TCCAGTCTAT TAATTGTTGCCGGGAAGCTA GAGTAAGTAG TTCGCCAGTT 3120 AATAGTTTGC GCAACGTTGT TGCCATTGCTGTAGGCATCG TGGTGTCACG CTCGTCGTTT 3180 GGTATGGCTT CATTCAGCTC CGGTTCCCAACGATCAAGGC GAGTTACATG ATCCCCCATG 3240 TTGTGCAAAA AAGCGGTTAG CTCCTTCGGTCCTCCGATCG TTGTCAGAAG TAAGTTGGCC 3300 GCAGTGTTAT CACTCATGGT TATGGCAGCACTGCATAATT CTCTTACTGT CATGCCATCC 3360 GTAAGATGCT TTTCTGTGAC TGGTGAGTACTCAACCAAGT CATTCTGAGA ATAGTGTATG 3420 CGGCGACCGA GTTGCTCTTG CCCGGCGTCAACACGGGATA ATACCGCGCC ACATAGCAGA 3480 ACTTTAAAAG TGCTCATCAT TGGAAAACGTTCTTCGGGGC GAAAACTCTC AAGGATCTTA 3540 CCGCTGTTGA GATCCAGTTC GATGTAACCCACTCGTGCAC CCAACTGATC TTCAGCATCT 3600 TTTACTTTCA 
CCAGCGTTTC TGGGTGAGCAAAAACAGGAA GGCAAAATGC CGCAAAAAAG 3660 GGAATAAGGG CGACACGGAA ATGTTGAATACTCATACTCT TCCTTTTTCA ATATTATTGA 3720 AGCATTTATC AGGGTTATTG TCTCATGAGCGGATACATAT TTGAATGTAT TTAGAAAAAT 3780 AAACAAAAAG AGTTTGTAGA AACGCAAAAAGGCCATCCGT CAGGATGGCC TTCTGCTTAA 3840 TTTGATGCCT GGCAGTTTAT GGCGGGCGTCCTGCCCGCCA CCCTCCGGGC CGTTGCTTCG 3900 CAACGTTCAA ATCCGCTCCC GGCGGATTTGTCCTACTCAG GAGAGCGTTC ACCGACAAAC 3960 AACAGATAAA ACGAAAGGCC CAGTCTTTCGACTGAGCCTT TCGTTTTATT TGATGCCTGG 4020 CAGTTCCCTA CTCTCGCATG GGGAGACCCCACACTACCAT CGGCGCTACG GCGTTTCACT 4080 TCTGAGTTCG GCATGGGGTC AGGTGGGACCACCGCGCTAC TGCCGCCAGG CAAATTCTGT 4140 TTTATCAGAC CGCTTCTGCG TTCTGATTTAATCTGTATCA GGCTGAAAAT CTTCTCTCAT 4200 CCGCCAAAAC AGCCAAGCTT GCATGCCTGCAGGTCGACTC TAGAGGATCC CCAAGAAAGT 4260 CCGTCGGACA GCTTTAATAA ACCCTGCACTTATCTGTTTA GTGTGGGCGG ACAAAATAGT 4320 TGGGAACTGG GAGGGGTGGA AATGGAGTTTTTAAGGATTA TTTAGGGAAG AGTGACAAAA 4380 TAGATGGGAA CTGGGTGTAG CGTCGTAAGCTAATACGAAA ATTAAAAATG ACAAAATAGT 4440 TTGGAACTAG ATTTCACTTA TCTGGTTGGTCGACCTGCAG GGGGGGGGGG GAAAGCCACG 4500 TTGTGTCTCA AAATCTCTGA TGTTACATTGCACAAGATAA AAATATATCA TCATGAACAA 4560 TAAAACTGTC TGCTTACATA AACAGTAATACAAGGGGTGT TATGAGCCAT ATTCAACGGG 4620 AAACGTCTTG CTCGAGGCCG CGATTAAATTCCAACATGGA TGCTGATTTA TATGGGTATA 4680 AATGGGCTCG CGATAATGTC GGGCAATCAGGTGCGACAAT CTATCGATTG TATGGGAAGC 4740 CCGATGCGCC AGAGTTGTTT CTGAAACATGGCAAAGGTAG CGTTGCCAAT GATGTTACAG 4800 ATGAGATGGT CAGACTAAAC TGGCTGACGGAATTTATGCC TCTTCCGACC ATCAAGCATT 4860 TTATCCGTAC TCCTGATGAT GCATGGTTACTCACCACTGC GATCCCCGGG AAAACAGCAT 4920 TCCAGGTATT AGAAGAATAT CCTGATTCAGGTGAAAATAT TGTTGATGCG CTGGCAGTGT 4980 TCCTGCGCCG GTTGCATTCG ATTCCTGTTTGTAATTGTCC TTTTAACAGC GATCGCGTAT 5040 TTCGTCTCGC TCAGGCGCAA TCACGAATGAATAACGGTTT GGTTGATGCG AGTGATTTTG 5100 ATGACGAGCG TAATGGCTGG CCTGTTGAACAAGTCTGGAA AGAAATGCAT AAGCTTTTGC 5160 CATTCTCACC GGATTCAGTC GTCACTCATGGTGATTTCTC ACTTGATAAC CTTATTTTTG 5220 ACGAGGGGAA ATTAATAGGT TGTATTGATGTTGGACGAGT CGGAATCGCA GACCGATACC 5280 AGGATCTTGC CATCCTATGG AACTGCCTCGGTGAGTTTTC 
TCCTTCATTA CAGAAACGGC 5340 TTTTTCAAAA ATATGGTATT GATAATCCTGATATGAATAA ATTGCAGTTT CATTTGATGC 5400 TCGATGAGTT TTTCTAATCA GAATTGGTTAATTGGTTGTA ACACTGGCAG AGCATTACGC 5460 TGACTTGACG GGACGGCGGC TTTGTTGAATAAATCGAACT TTTGCTGAGT TGAAGGATCA 5520 GATCACGCAT CTTCCCGACA ACGCAGACCGTTCCGTGGCA AAGCAAAAGT TCAAAATCAC 5580 CAACTGGTCC ACCTACAACA AAGCTCTCATCAACCGTGGC TCCCTCACTT TCTGGCTGGA 5640 TGATGGGGCG ATTCAGGCCT GGTATGAGTCAGCAACACCT TCTTCACGAG GCAGACCTCA 5700 GCGCCCCCCC CCCCCTGCAG GTCGACCCCACGCCCCTCTT TAATACGACG GGCAATTTGC 5760 ACTTCAGAAA ATGAAGAGTT TGCTTTAGCCATAACAAAAG TCCAGTATGC TTTTTCACAG 5820 CATAACTGGA CTGATTTCAG TTTACAACTATTCTGTCTAG TTTAAGACTT TATTGTCATA 5880 GTTTAGATCT ATTTTGTTCA GTTTAAGACTTTATTGTCCG CCCACA 5926 5926 base pairs nucleic acid single circularother nucleic acid /desc = “pEM-delta” 4 CAGATCAATT CGCGCTAACTCACATTAATT GCGTTGCGCT CACTGCCCGC TTTCCAGTCG 60 GGAAACCTGT CGTGCCAGCTGCATTAATGA ATCGGCCAAC GCGCGGGGAG AGGCGGTTTG 120 CGTATTGGGC GCCAGGGTGGTTTTTCTTTT CACCAGTGAG ACGGGCAACA GCTGATTGCC 180 CTTCACCGCC TGGCCCTGAGAGAGTTGCAG CAAGCGGTCC ACGCTGGTTT GCCCCAGCAG 240 GCGAAAATCC TGTTTGATGGTGGTTGACGG CGGGATATAA CATGAGCTGT CTTCGGTATC 300 GTCGTATCCC ACTACCGAGATATCCGCACC AACGCGCAGC CCGGACTCGG TAATGGCGCG 360 CATTGCGCCC AGCGCCATCTGATCGTTGGC AACCAGCATC GCAGTGGGAA CGATGCCCTC 420 ATTCAGCATT TGCATGGTTTGTTGAAAACC GGACATGGCA CTCCAGTCGC CTTCCCGTTC 480 CGCTATCGGC TGAATTTGATTGCGAGTGAG ATATTTATGC CAGCCAGCCA GACGCAGACG 540 CGCCGAGACA GAACTTAATGGGCCCGCTAA CAGCGCGATT TGCTGGTGAC CCAATGCGAC 600 CAGATGCTCC ACGCCCAGTCGCGTACCGTC TTCATGGGAG AAAATAATAC TGTTGATGGG 660 TGTCTGGTCA GAGACATCAAGAAATAACGC CGGAACATTA GTGCAGGCAG CTTCCACAGC 720 AATGGCATCC TGGTCATCCAGCGGATAGTT AATGATCAGC CCACTGACGC GTTGCGCGAG 780 AAGATTGTGC ACCGCCGCTTTACAGGCTTC GACGCCGCTT CGTTCTACCA TCGACACCAC 840 CACGCTGGCA CCCAGTTGATCGGCGCGAGA TTTAATCGCC GCGACAATTT GCGACGGCGC 900 GTGCAGGGCC AGACTGGAGGTGGCAACGCC AATCAGCAAC GACTGTTTGC CCGCCAGTTG 960 TTGTGCCACG CGGTTGGGAATGTAATTCAG CTCCGCCATC GCCGCTTCCA CTTTTTCCCG 1020 CGTTTTCGCA 
GAAACGTGGCTGGCCTGGTT CACCACGCGG GAAACGGTCT GATAAGAGAC 1080 ACCGGCATAC TCTGCGACATCGTATAACGT TACTGGTTTC ACATTCACCA CCCTGAATTG 1140 ACTCTCTTCC GGGCGCTATCATGCCATACC GCGAAAGGTT TTGCACCATT CGATGGTGTG 1200 AACGTAAATG CATGCCGCTTCGCCTTCGCG CGCGAATTGA TCTGCTGCCT CGCGCGTTTC 1260 GGTGATGACG GTGAAAACCTCTGACACATG CAGCTCCCGG AGACGGTCAC AGCTTGTCTG 1320 TAAGCGGATG CCGGGAGCAGACAAGCCCGT CAGGGCGCGT CAGCGGGTGT TGGCGGGTGT 1380 CGGGGCGCAG CCATGACCCAGTCACGTAGC GATAGCGGAG TGTATACTGG CTTAACTATG 1440 CGGCATCAGA GCAGATTGTACTGAGAGTGC ACCATATGCG GTGTGAAATA CCGCACAGAT 1500 GCGTAAGGAG AAAATACCGCATCAGGCGCT CTTCCGCTTC CTCGCTCACT GACTCGCTGC 1560 GCTCGGTCGT TCGGCTGCGGCGAGCGGTAT CAGCTCACTC AAAGGCGGTA ATACGGTTAT 1620 CCACAGAATC AGGGGATAACGCAGGAAAGA ACATGTGAGC AAAAGGCCAG CAAAAGGCCA 1680 GGAACCGTAA AAAGGCCGCGTTGCTGGCGT TTTTCCATAG GCTCCGCCCC CCTGACGAGC 1740 ATCACAAAAA TCGACGCTCAAGTCAGAGGT GGCGAAACCC GACAGGACTA TAAAGATACC 1800 AGGCGTTTCC CCCTGGAAGCTCCCTCGTGC GCTCTCCTGT TCCGACCCTG CCGCTTACCG 1860 GATACCTGTC CGCCTTTCTCCCTTCGGGAA GCGTGGCGCT TTCTCATAGC TCACGCTGTA 1920 GGTATCTCAG TTCGGTGTAGGTCGTTCGCT CCAAGCTGGG CTGTGTGCAC GAACCCCCCG 1980 TTCAGCCCGA CCGCTGCGCCTTATCCGGTA ACTATCGTCT TGAGTCCAAC CCGGTAAGAC 2040 ACGACTTATC GCCACTGGCAGCAGCCACTG GTAACAGGAT TAGCAGAGCG AGGTATGTAG 2100 GCGGTGCTAC AGAGTTCTTGAAGTGGTGGC CTAACTACGG CTACACTAGA AGGACAGTAT 2160 TTGGTATCTG CGCTCTGCTGAAGCCAGTTA CCTTCGGAAA AAGAGTTGGT AGCTCTTGAT 2220 CCGGCAAACA AACCACCGCTGGTAGCGGTG GTTTTTTTGT TTGCAAGCAG CAGATTACGC 2280 GCAGAAAAAA AGGATCTCAAGAAGATCCTT TGATCTTTTC TACGGGGTCT GACGCTCAGT 2340 GGAACGAAAA CTCACGTTAAGGGATTTTGG TCATGAGATT ATCAAAAAGG ATCTTCACCT 2400 AGATCCTTTT AAATTAAAAATGAAGTTTTA AATCAATCTA AAGTATATAT GAGTAAACTT 2460 GGTCTGACAG TTACCAATGCTTAATCAGTG AGGCACCTAT CTCAGCGATC TGTCTATTTC 2520 GTTCATCCAT AGTTGCCTGACTCCCCGTCG TGTAGATAAC TACGATACGG GAGGGCTTAC 2580 CATCTGGCCC CAGTGCTGCAATGATACCGC GAGACCCACG CTCACCGGCT CCAGATTTAT 2640 CAGCAATAAA CCAGCCAGCCGGAAGGGCCG AGCGCAGAAG TGGTCCTGCA ACTTTATCCG 2700 CCTCCATCCA GTCTATTAATTGTTGCCGGG AAGCTAGAGT 
AAGTAGTTCG CCAGTTAATA 2760 GTTTGCGCAA CGTTGTTGCCATTGCTGTAG GCATCGTGGT GTCACGCTCG TCGTTTGGTA 2820 TGGCTTCATT CAGCTCCGGTTCCCAACGAT CAAGGCGAGT TACATGATCC CCCATGTTGT 2880 GCAAAAAAGC GGTTAGCTCCTTCGGTCCTC CGATCGTTGT CAGAAGTAAG TTGGCCGCAG 2940 TGTTATCACT CATGGTTATGGCAGCACTGC ATAATTCTCT TACTGTCATG CCATCCGTAA 3000 GATGCTTTTC TGTGACTGGTGAGTACTCAA CCAAGTCATT CTGAGAATAG TGTATGCGGC 3060 GACCGAGTTG CTCTTGCCCGGCGTCAACAC GGGATAATAC CGCGCCACAT AGCAGAACTT 3120 TAAAAGTGCT CATCATTGGAAAACGTTCTT CGGGGCGAAA ACTCTCAAGG ATCTTACCGC 3180 TGTTGAGATC CAGTTCGATGTAACCCACTC GTGCACCCAA CTGATCTTCA GCATCTTTTA 3240 CTTTCACCAG CGTTTCTGGGTGAGCAAAAA CAGGAAGGCA AAATGCCGCA AAAAAGGGAA 3300 TAAGGGCGAC ACGGAAATGTTGAATACTCA TACTCTTCCT TTTTCAATAT TATTGAAGCA 3360 TTTATCAGGG TTATTGTCTCATGAGCGGAT ACATATTTGA ATGTATTTAG AAAAATAAAC 3420 AAAAAGAGTT TGTAGAAACGCAAAAAGGCC ATCCGTCAGG ATGGCCTTCT GCTTAATTTG 3480 ATGCCTGGCA GTTTATGGCGGGCGTCCTGC CCGCCACCCT CCGGGCCGTT GCTTCGCAAC 3540 GTTCAAATCC GCTCCCGGCGGATTTGTCCT ACTCAGGAGA GCGTTCACCG ACAAACAACA 3600 GATAAAACGA AAGGCCCAGTCTTTCGACTG AGCCTTTCGT TTTATTTGAT GCCTGGCAGT 3660 TCCCTACTCT CGCATGGGGAGACCCCACAC TACCATCGGC GCTACGGCGT TTCACTTCTG 3720 AGTTCGGCAT GGGGTCAGGTGGGACCACCG CGCTACTGCC GCCAGGCAAA TTCTGTTTTA 3780 TCAGACCGCT TCTGCGTTCTGATTTAATCT GTATCAGGCT GAAAATCTTC TCTCATCCGC 3840 CAAAACAGCC AAGCTTGCATGCCTGCAGGT CGACTCTAGA GGATCCCCAA GAAAGTCCGT 3900 CGGACAGCTT TAATAAACCCTGCACTTATC TGTTTAGTGT GGGCGGACAA AATAGTTGGG 3960 AACTGGGAGG GGTGGAAATGGAGTTTTTAA GGATTATTTA GGGAAGAGTG ACAAAATAGA 4020 TGGGAACTGG GTGTAGCGTCGTAAGCTAAT ACGAAAATTA AAAATGACAA AATAGTTTGG 4080 AACTAGATTT CACTTATCTGGTTGGTCGAC CTGCAGGGGG GGGGGGGAAA GCCACGTTGT 4140 GTCTCAAAAT CTCTGATGTTACATTGCACA AGATAAAAAT ATATCATCAT GAACAATAAA 4200 ACTGTCTGCT TACATAAACAGTAATACAAG GGGTGTTATG AGCCATATTC AACGGGAAAC 4260 GTCTTGCTCG AGGCCGCGATTAAATTCCAA CATGGATGCT GATTTATATG GGTATAAATG 4320 GGCTCGCGAT AATGTCGGGCAATCAGGTGC GACAATCTAT CGATTGTATG GGAAGCCCGA 4380 TGCGCCAGAG TTGTTTCTGAAACATGGCAA AGGTAGCGTT GCCAATGATG TTACAGATGA 4440 GATGGTCAGA 
CTAAACTGGCTGACGGAATT TATGCCTCTT CCGACCATCA AGCATTTTAT 4500 CCGTACTCCT GATGATGCATGGTTACTCAC CACTGCGATC CCCGGGAAAA CAGCATTCCA 4560 GGTATTAGAA GAATATCCTGATTCAGGTGA AAATATTGTT GATGCGCTGG CAGTGTTCCT 4620 GCGCCGGTTG CATTCGATTCCTGTTTGTAA TTGTCCTTTT AACAGCGATC GCGTATTTCG 4680 TCTCGCTCAG GCGCAATCACGAATGAATAA CGGTTTGGTT GATGCGAGTG ATTTTGATGA 4740 CGAGCGTAAT GGCTGGCCTGTTGAACAAGT CTGGAAAGAA ATGCATAAGC TTTTGCCATT 4800 CTCACCGGAT TCAGTCGTCACTCATGGTGA TTTCTCACTT GATAACCTTA TTTTTGACGA 4860 GGGGAAATTA ATAGGTTGTATTGATGTTGG ACGAGTCGGA ATCGCAGACC GATACCAGGA 4920 TCTTGCCATC CTATGGAACTGCCTCGGTGA GTTTTCTCCT TCATTACAGA AACGGCTTTT 4980 TCAAAAATAT GGTATTGATAATCCTGATAT GAATAAATTG CAGTTTCATT TGATGCTCGA 5040 TGAGTTTTTC TAATCAGAATTGGTTAATTG GTTGTAACAC TGGCAGAGCA TTACGCTGAC 5100 TTGACGGGAC GGCGGCTTTGTTGAATAAAT CGAACTTTTG CTGAGTTGAA GGATCAGATC 5160 ACGCATCTTC CCGACAACGCAGACCGTTCC GTGGCAAAGC AAAAGTTCAA AATCACCATT 5220 TGGTCCACCT ACAACAAAGCTCTCATCAAC CGTGGCTCCC TCACTTTCTG GCTGGATGAT 5280 GGGGCGATTC AGGCCTGGTATGAGTCAGCA ACACCTTCTT CACGAGGCAG ACCTCAGCGC 5340 CCCCCCCCCC CTGCAGGTCGACCCCACGCC CCTCTTTAAT ACGACGGGCA ATTTGCACTT 5400 CAGAAAATGA AGAGTTTGCTTTAGCCATAA CAAAAGTCCA GTATGCTTTT TCACAGCATA 5460 ACTGGACTGA TTTCAGTTTACAACTATTCT GTCTAGTTTA AGACTTTATT GTCATAGTTT 5520 AGATCTATTT TGTTCAGTTTAAGACTTTAT TGTCCGCCCA CATTTAGAGC AATTCGGTGT 5580 TAGTTTCAGC AAGCAAACATTAACCATAGC TAATGATTTA TAGCCATATT AACCATTGGG 5640 GTACCGAGCT CGAATTCCATGGTCTGTTTC CTGTGTGAAA TTGTTATCCG CTCACAATTC 5700 CACACATTAT ACGAGCCGGATGATTAATTG TCAACAGCTC ATTTCAGAAT ATTTGCCAGA 5760 ACCGTTATGA TGTCGGCGCAAAAAACATTA TCCAGAACGG GAGTGCGCCT TGAGCGACAC 5820 GAATTATGCA GTGATTTACGACCTGCACAG CCATACCACA GCTTCCGATG GCTGCCTGAC 5880 GCCAGAAGCA TTGGTGCACCGTGCAGTCGA TGATAAGCTG TCAAAC 5926 8906 base pairs nucleic acid singlecircular other nucleic acid /desc = “pER183 (target plasmid)” 5GAATTCCGGA TGAGCATTCA TCAGGCGGGC AAGAATGTGA ATAAAGGCCG GATAAAACTT 60GTGCTTATTT TTCTTTACGG TCTTTAAAAA GGCCGTAATA TCCAGCTGAA CGGTCTGGTT 120ATAGGTACAT TGAGCAACTG ACTGAAATGC 
CTCAAAATGT TCTTTACGAT GCCATTGGGA 180TATATCAACG GTGGTATATC CAGTGATTTT TTTCTCCATT TTAGCTTCCT TAGCTCCTGA 240AAATCTCGAT AACTCAAAAA ATACGCCCGG TAGTGATCTT ATTTCATTAT GGTGAAAGTT 300GGAACCTCTT ACGTGCCGAT CAACGTCTCA TTTTCGCCAA AAGTTGGCCC AGGGCTTCCC 360GGTATCAACA GGGACACCAG GATTTATTTA TTCTGCGAAG TGATCTTCCG TCACAGGTAT 420TTATTCGGCG CAAAGTGCGT CGGGTGATGC TGCCAACTTA CTGATTTAGT GTATGATGGT 480GTTTTTGAGG TGCTCCAGTG GCTTCTGTTT CTATCAGCTG TCCCTCCTGT TCAGCTACTG 540ACGGGGTGGT GCGTAACGGC AAAAGCACCG CCGGACATCA GCGCTAGCGG AGTGTATACT 600GGCTTACTAT GTTGGCACTG ATGAGGGTGT CAGTGAAGTG CTTCATGTGG CAGGAGAAAA 660AAGGCTGCAC CGGTGCGTCA GCAGAATATG TGATACAGGA TATATTCCGC TTCCTCGCTC 720ACTGACTCGC TACGCTCGGT CGTTCGACTG CGGCGAGCGG AAATGGCTTA CGAACGGGGC 780GGAGATTTCC TGGAAGATGC CAGGAAGATA CTTAACAGGG AAGTGAGAGG GCCGCGGCAA 840AGCCGTTTTT CCATAGGCTC CGCCCCCCTG ACAAGCATCA CGAAATCTGA CGCTCAAATC 900AGTGGTGGCG AAACCCGACA GGACTATAAA GATACCAGGC GTTTCCCCTG GCGGCTCCCT 960CGTGCGCTCT CCTGTTCCTG CCTTTCGGTT TACCGGTGTC ATTCCGCTGT TATGGCCGCG 1020TTTGTCTCAT TCCACGCCTG ACACTCAGTT CCGGGTAGGC AGTTCGCTCC AAGCTGGACT 1080GTATGCACGA ACCCCCCGTT CAGTCCGACC GCTGCGCCTT ATCCGGTAAC TATCGTCTTG 1140AGTCCAACCC GGAAAGACAT GCAAAAGCAC CACTGGCAGC AGCCACTGGT AATTGATTTA 1200GAGGAGTTAG TCTTGAAGTC ATGCGCCGGT TAAGGCTAAA CTGAAAGGAC AAGTTTTGGT 1260GACTGCGCTC CTCCAAGCCA GTTACCTCGG TTCAAAGAGT TGGTAGCTCA GAGAACCTTC 1320GAAAAACCGC CCTGCAAGGC GGTTTTTTCG TTTTCAGAGC AAGAGATTAC GCGCAGACCA 1380AAACGATCTC AAGAAGATCA TCTTATTAAT CAGATAAAAT ATTTCTAGAT TTCAGTGCAA 1440TTTATCTCTT CAAATGTAGC ACCTGAAGTC AGCCCCATAC GATATAAGTT GTAATTCTCA 1500TGTTTGACAG CTTATCATCG GATGGATCTG AAATTGTAAA CGTTAATATT TTGTTAAATT 1560CGCGTTAAAT TTTTGTTAAA TCAGCTCATT TTTTAACCAA TAGGCCGAAA TCGGCAAAAT 1620CCCTTATAAA TCAAAAGAAT AGACCGAGAT AGGGTTGAGT GTTGTTCCAG TTTGGAACAA 1680GAGTCCACTA TTAAAGAACG TGGACTCCAA CGTCAAAGGG CGAAAAACCG TCTATCAGGG 1740CGATGGCCCA CTACGTGAAC CATCACCCTA ATCAAGTTTT TTGGGGTCGA GGTGCCGTAA 1800AGCACTAAAT CGGAACCCTA AAGGGAGCCC CCGATTTAGA GCTTGACGGG GAAAGCCGGC 1860GAACGTGGCG 
AGAAAGGAAG GGAAGAAAGC GAAAGGAGCG GGCGCTAGGG CGCTGGCAAG 1920TGTAGCGGTC ACGCTGCGCG TAACCACCAC ACCCGCCGCG CTTAATGCGC CGCTACAGGG 1980CGCGTCAGAT CCCATCGATA AGCTTTAATG CGGTAGTTTA TCACAGTTAA ATTGCTAACG 2040CAGTCAGGCA CCGTGTATGA AATCTAACAA TGCGCTCATC GTCATCCTCG GCACCGTCAC 2100CCTGGATGCT GTAGGCATAG GCTTGGTTAT GCCGGTACTG CCGGGCCTCT TGCGGGATAT 2160CGTCCATTCC GACAGCATCG CCAGTCACTA TGGCGTGCTG CTAGCGCTAT ATGCGTTGAT 2220GCAATTTCTA TGCGCACCCG TTCTCGGAGC ACTGTCCGAC CGCTTTGGCC GCCGCCCAGT 2280CCTGCTCGCT TCGCTACTTG GAGCCACTAT CGACTACGCG ATCATGGCGA CCACACCCGT 2340CCTGTGGATC CGCTGGCGAA AGGGGGATGT GCTGCAAGGC GATTAAGTTG GGTAACGCCA 2400GGGTTTTCCC AGTCACGACG TTGTAAAACG ACGGCCAGTG AATTGCGGCC GCCCTGCAAG 2460GAAGGGAATG TCGCCAACAG CGAAGAGAGT TGGGCAACGG ATGTGCTGGT GGAGGTGATC 2520GCCTCCTGAT GATGAGCCGC TCCCGATGTG GTGTCGGGAG CGGTATTTTC TATAAAACTT 2580ACCGCTTATT TGAGATATTC ATCGAAAATG TCGAGTAATT CTTGATGTAT ACACGGCCAT 2640TCCTGACCTA AATTGACGGT ACACAAGCCA ATATCGAAGC CATTAATTTT ATAACGATGT 2700TTCACTGCGG TATCTACGTG GGGATATATT AATAACCCCC CTATGTTTTC GCCATTTTCA 2760GGCTTTAACG ACCATAAGTA ATTCATCAGT TGATAAAGAT TTTGCGAATG AAATTTTTCT 2820GTTCCCATTC GTCGTGAAAA AATGCTCTTA TAGTATTTGG CGTCAACGAT AAGTATTTTT 2880TCTGATGAGC GAATGGTGAT GTCAGTTTCC ATTCGAGGTA ACAAATTAAG TGACTGATCC 2940GATATACTCG ATGCATCCCA TTTTAAATAA GAGCGGGTTG TGTTTGCAGA CGTTAATTCA 3000CGACGGCAAA ATTCATAAAG AAACTTTTGA TAAAGTAATG ACATCTCTTT TTCGTTTCTT 3060TCAAAATCAT AGAAACGGTA GTGTCCTTTG TTTTGACCTG GAATAGAATT ATTGACGATG 3120AATTTGCAGA CACTGATAAC GAATTTATAA TAACGCGTAT TTTTTCCGCC ATTCAGATAG 3180CTGAAATGCT GCGGAGTTAA ATGAAGAGTG CTAATGCCCG GTAATTTTCT ATAAAGTGAA 3240CGAGCTTCAT CTCTGATAGT TGAATTTAAC TTTTCATGCT TAATTAATAT GGCTAATGTG 3300CTTTTTATAA TTCGGTTAGC CAGCGTGTCT TCATTAAGCA TATCAAAAGT ACTGACGGTT 3360TTCCCATGAT TAAGATGGAA GCCGCGTATT GTTTTAGCAA ACTCTATTCG CCCTTTGATG 3420CCAGGAATGA TCTCGGTGTT AGGATTGTAA TCAAGCTCAA GCCCTCGGCG TGAAAGCTGT 3480AAAACCCCTT TATTTAATAC ATACCCCAGG ATATCAAGAA GATTGTTACC GGGTATGGCT 3540TCAAGGTTTG CCTGCTTAAT TTCCTGTAAA TAACCCCATG 
CATAGGTAAG CATGTAATAG 3600ATATTACGGA CAGGTATCAC GGGCTGTTCC ACTATGAGTC CCCTAATAAT TTGTTGGTCC 3660ATTTCTGTTG TTTATAGGGG TCATCAAAGA AATATTCTTC GAGTAAAGGG GCGATATCCG 3720TCATCACAAT TTCATTAAGC CATTGCGTAT CCGGAGAGGT GCCATCTTCC AACCCACGCA 3780AGAAGTAACT ATGCCCAATG CGGAATCCTT TCCCAAGGAT AGTGGCCTCT TTGCTGATTT 3840CCTGGTTCAA CTCGTTCATT TTTTGGCATA AAGACTCAAC AAATGAAGGT TCTGCTTTTT 3900TATTCAGTAA AAAATTCCGG AACTGTGGTG TATCAAAACC TGGCTCAATA TCTATGAAAG 3960AAAATCGTCT GCGTAGGGCA TAGTCAACAA CGGCCAGAGA GCGATCGGCA GTATTCATTA 4020AACCGATGAT ATAAACATTC TCCGGGACAT AGAATCGTTC TTCATCGTTT TCGGAGTAGG 4080TTAGGGGAAC AGACCAGTTT TCACCTCGTT TATCATGTTC CATTAACATC ATCACTTCGC 4140CAAATACTTT ACTGAGATTG GCACGATTGA TTTCATCTAT AATAAAAATA TACTTTTTCT 4200CTGGCTGCTC TTTAGCTTGC TGACAAAAAT TGTAAAATAT GCCGTCTTTA CGTCGGAAGC 4260CGACGCCATT CGGACGATAG CCCTGTATAA AATCCTCATA GCTATAAGAT TGATGGAACT 4320GAACCATATT GACGCGTTGC GGAGCCTTTT CTCCTGTCAG CAAGTAAGCC AGACGGCGTG 4380CAACAAAGGT TTTTCCAACG CCGGGCGGCC CCTGGAGGAT AATATTTTTT TTGATGGTTA 4440ATCGTTTGAG TATCGTCTCT ATTGTGGTTT CAGGGATAAA CAAATCATTT AACGCATCTT 4500CCAGACAGTA TGATTCAGTT TTTGACATAG GTGGAATAAC ACTCTTGCCA GAATTAAATA 4560TTAATTTATA GTCGTTGATT ATGTTGTCCA GCATAGAGGC AAATCGGGTG TAATCAATAC 4620CCTGTGAGAC TTTTTGGGAA CAGGCGTAAT AGGACTGTCC GTATTTTTTA GGATATACAC 4680CCGAAGTTGC CTGAAAATAC TCTGCGATTG TTTTAGGTAT GTCTGAAGAG AACTGCCATT 4740GGGCATGTGG TTCATTCGTG TCGCTTATAC CATAAGCCAA AACCAACTCA TCAAAATCTT 4800TATAATAGAG AATAACGGGA TATATACCGT TAGAAGCTTC CTGACCTTCT CCAAGAAATG 4860CAAACCAGGG AATAGACGTA AAATTACCAT AACCGAAACT CAATTTTACT CGCAGGTTAC 4920GGTAAGACGT TGGATAATCT TTAGTGGATT GCGAACGTTG TTGCTGTGCT TGCTTAATAA 4980ATTTTTCAAT CCAGGGTTGA ATAGATTCCA TAAGATATGC CTTCCTCATT GCTAAGCCTC 5040TATTATCGCT TTCGCAACGT ACTGAAACAA TAGATTTTTA CTGCAAAATC AGACTGGTAA 5100ATATTTACTG AGGGGGAAAG TTTCTATTGA GTCAGTGGAA GGCTCCCGGT GGTTAACCGG 5160GAGTAAACGC TGTTACGCGA CTTTCTGTTT ACCGGCAATC ACTCCAATAA ACGCCTGCAC 5220CTGCTTTTGT TTACGCGCCG ACAGTTTGCA CACCTGGCGT AGCGACTGCA TCAGTTCGCT 5280CTCCTCGGCG 
GCGGGTGGTT GGGCGGTGAG GACAATACAG CCTTCCATCA CTTTGACATC 5340TACCGCCGTG CCAGTGGCAA AACCGGCGGC TTCCAGCCAC TGACCTTTCA GGGTGATGGC 5400GGGAATACGG CTGTAATCCG GGTAGCGACT CGCATAACCG ACGGTGACAT GACGGTTATT 5460TGCCGGGGAG ACTTCTGCTT CGAACGGTTG TGCAATAGAA TGCGTGTCAG TCATAACTGC 5520TATTCTCCAG GAATAGTGAT TGTGATTAGC GATGCGGGTG TGTTGGCGCA CATCCGCACC 5580GCGCTAAATA CCTGTATATA TCATCAGTAA ATATGGGGAA AGTCCAGCTA AAAATAGAAT 5640AAAATGGGCA ATTTCTGGAA TGATTTAAAT ATATTTATGT GGGTTATGAT TGGCGTGAAA 5700TAATAAAAAG CGCACCGGAA AGGTGCGCCA GAAAATAATG TTCAGGATTT TTTACGTGAG 5760GCTTTTTTAC CCCCGCTAGC TGCGCGTTCA GCTTTGATTT TTTCCAGCAA CGCGGCGGCG 5820CTGTTTTCTC CGCTGATCAA ATCCGGGTTT TCGGCCCGCC ACTGGGCGGT AAGTTCACCA 5880CGGAACGCTT TTGCCAGGAT GGATTGCGTC AGGTTGTTGA CGCGGGCTAA GGCGTTGTTG 5940ACCTGTTTTT CTATGGTGTC GGCGTAGGCG AAGAGTTGCT CGACGCGGCG AACGATTTCG 6000GCTTGTTCTT TTACTGGAGG TAATAAAACA ACTTGGGATT TGATATCTTT TCCTGAAATA 6060CCTTTTTGAC CAGAAGTTGT TTTCACGCAG TTCATCATTG CATTTCGTGC TGAGGGGGAT 6120GAAAAAAATA TTTCGATATA TTCTGGTAAA GCATCTTTGG TTAATCGAGC TCGAATAAGT 6180TTATCAGGAT ATAGCAAATT TTGATGTTGT AATTTTTTCA ATAACCCACA AACACCAACA 6240AATTCTAAAC TTCCGTTATA GCGAGTAAAT AAAAGATCTC CATCTTGTAA TTTGTGGCGG 6300TTTAGTTCAC TTTCTGAACA TTCTAGAGTC GACCTGCAGG CATGCAAGCT TGGCGTAATC 6360ATGGTCATAG CTGTTTCCTG TGTGAAATTG TTATCCGCTC ACAATTCCAC ACAACATACG 6420AGCCGGAAGC ATAAAGTGTA AAGCCTGGGG TGCCTAATGA GTGAGCTAAC TCACATTAAT 6480TGCGTTGCGC TCACTGCCCG CTTTCCAGTC GGGAAACCTG TCGTGCCAGC GGATCCTCTA 6540CGCCGGACGC ATCGTGGCCG GCATCACCGG CGCCACAGGT GCGGTTGCTG GCGCCTATAT 6600CGCCGACATC ACCGATGGGG AAGATCGGGC TCGCCACTTC GGGCTCATGA GCGCTTGTTT 6660CGGCGTGGGT ATGGTGGCAG GCCCCGTGGC CGGGGGACTG TTGGGCGCCA TCTCCTTGCA 6720TGCACCATTC CTTGCGGCGG CGGTGCTCAA CGGCCTCAAC CTACTACTGG GCTGCTTCCT 6780AATGCAGGAG TCGCATAAGG GAGAGCGTCG ACCGATGCCC TTGAGAGCCT TCAACCCAGT 6840CAGCTCCTTC CGGTGGGCGC GGGGCATGAC TATCGTCGCC GCACTTATGA CTGTCTTCTT 6900TATCATGCAA CTCGTAGGAC AGGTGCCGGC AGCGCTCTGG GTCATTTTCG GCGAGGACCG 6960CTTTCGCTGG AGCGCGACGA TGATCGGCCT GTCGCTTGCG 
GTATTCGGAA TCTTGCACGC 7020CCTCGCTCAA GCCTTCGTCA CTGGTCCCGC CACCAAACGT TTCGGCGAGA AGCAGGCCAT 7080TATCGCCGGC ATGGCGGCCG ACGCGCTGGG CTACGTCTTG CTGGCGTTCG CGACGCGAGG 7140CTGGATGGCC TTCCCCATTA TGATTCTTCT CGCTTCCGGC GGCATCGGGA TGCCCGCGTT 7200GCAGGCCATG CTGTCCAGGC AGGTAGATGA CGACCATCAG GGACAGCTTC AAGGATCGCT 7260CGCGGCTCTT ACCAGCCTAA CTTCGATCAT TGGACCGCTG ATCGTCACGG CGATTTATGC 7320CGCCTCGGCG AGCACATGGA ACGGGTTGGC ATGGATTGTA GGCGCCGCCC TATACCTTGT 7380CTGCCTCCCC GCGTTGCGTC GCGGTGCATG GAGCCGGGCC ACCTCGACCT GAATGGAAGC 7440CGGCGGCACC TCGCTAACGG ATTCACCACT CCAAGAATTG GAGCCAATCA ATTCTTGCGG 7500AGAACTGTGA ATGCGCAAAC CAACCCTTGG CAGAACATAT CCATCGCGTC CGCCATCTCC 7560AGCAGCCGCA CGCGGCGCAT CTCGGGCAGC GTTGGGTCCT GGCCACGGGT GCGCATGATC 7620GTGCTCCTGT CGTTGAGGAC CCGGCTAGGC TGGCGGGGTT GCCTTACTGG TTAGCAGAAT 7680GAATCACCGA TACGCGAGCG AACGTGAAGC GACTGCTGCT GCAAAACGTC TGCGACCTGA 7740GCAACAACAT GAATGGTCTT CGGTTTCCGT GTTTCGTAAA GTCTGGAAAC GCGGAAGTCC 7800CCTACGTGCT GCTGAAGTTG CCCGCAACAG AGAGTGGAAC CAACCGGTGA TACCACGATA 7860CTATGACTGA GAGTCAACGC CATGAGCGGC CTCATTTCTT ATTCTGAGTT ACAACAGTCC 7920GCACCGCTGC CGGTAGCTCC TTCCGGTGGG CGCGGGGCAT GACTATCGTC GCCGCACTTA 7980TGACTGTCTT CTTTATCATG CAACTCGTAG GACAGGTGCC GGCAGCGCCC AACAGTCCCC 8040CGGCCACGGG GCCTGCCACC ATACCCACGC CGAAACAAGC GCCCTGCACC ATTATGTTCC 8100GGATCTGCAT CGCAGGATGC TGCTGGCTAC CCTGTGGAAC ACCTACATCT GTATTAACGA 8160AGCGCTAACC GTTTTTATCA GGCTCTGGGA GGCAGAATAA ATGATCATAT CGTCAATTAT 8220TACCTCCACG GGGAGAGCCT GAGCAAACTG GCCTCAGGCA TTTGAGAAGC ACACGGTCAT 8280ACTGCTTCCG GTAGTCAATA AACCGGTAAA CCAGCAATAG ACATAAGCGG CTATTTAACG 8340ACCCTGCCCT GAACCGACGA CCGGGTCGAA TTTGCTTTCG AATTTCTGCC ATTCATCCGC 8400TTATTATCAC TTATTCAGGC GTAGCAACCA GGCGTTTAAG GGCACCAATA ACTGCCTTAA 8460AAAAATTACG CCCCGCCCTG CCACTCATCG CAGTACTGTT GTAATTCATT AAGCATTCTG 8520CCGACATGGA AGCCATCACA GACGGCATGA TGAACCTGAA TCGCCAGCGG CATCAGCACC 8580TTGTCGCCTT GCGTATAATA TTTGCCCATG GTGAAAACGG GGGCGAAGAA GTTGTCCATA 8640TTGGCCACGT TTAAATCAAA ACTGGTGAAA CTCACCCAGG GATTGGCTGA GACGAAAAAC 8700ATATTCTCAA 
TAAACCCTTT AGGGAAATAG GCCAGGTTTT CACCGTAACA CGCCACATCT 8760TGCGAATATA TGTGTAGAAA CTGCCGGAAA TCGTCGTGGT ATTCACTCCA GAGCGATGAA 8820AACGTTTCAG TTTGCTCATG GAAAACGGTG TAACAAGGGT GAACACTATC CCATATCACC 8880AGCTCACCGT CTTTCATTGC CATACG 8906 3190 base pairs nucleic acid singlecircular other nucleic acid /desc = “pRM2 (target plasmid)” 6 GCGCCCAATACGCAAACCGC CTCTCCCCGC GCGTTGGCCG ATTCATTAAT GCAGCTGGCA 60 CGACAGGTTTCCCGACTGGA AAGCGGGCAG TGAGCGCAAC GCAATTAATG TGAGTTAGCT 120 CACTCATTAGGCACCCCAGG CTTTACACTT TATGCTTCCG GCTCGTATGT TGTGTGGAAT 180 TGTGAGCGGATAACAATTTC ACACAGGAAA CAGCTATGAC CATGATTACG AATTCGAGCT 240 CGGTACCCGGGGATCCTCTA GAGTCGAGAT GCCGCATGTG GAAGAGGTGA TTGCACCGAT 300 CTTCTACACCGTTCCGCTGC AGCTGCTGGC TTACCATGTC GCGCTGATCA AAGGCACCGA 360 CGTTGACCAGCCGCGTAACC TGGCAAAATC GGTTACGGTT GAGTAATAAA TGGATGCCCT 420 GCGTAAGCGGGGCATTTTTC TTCCTGTTAT GTTTTTAATC AAACATCCTG CCAACTCCAT 480 GTGACAAACCGTCATCTTCG GCTACTTTTT CTCTGTCACA GAATGAAAAT TTTCTGTCAT 540 CTCTTCGTTATTAATGTTTG TAATTGACTG AATATCAACG CTTATTTAAA TCAGACTGAA 600 GACTTATCTCTCTCTGTCAT AAAACTGTCA TATTCCTTAC ATATAACTGT CACCTGTTTG 660 TCCTATTTTGCTTGTCGTAG CCAACAAACA ATGCTTTATG AATCCTCCCA GGAGACATTA 720 TGAAAGTTATGCGTACCACC GTCGCAACTG TTGTCGCCGC GACCTTATCG ACCTGCAGGC 780 ATGCAAGCTTGGCACTGGCC GTCGTTTTAC AACGTCGTGA CTGGGAAAAC CCTGGCGTTA 840 CCCAACTTAATCGCCTTGCA GCACATCCCC CTTTCGCCAG CTGGCGTAAT AGCGAAGAGG 900 CCCGCACCGATCGCCCTTCC CAACAGTTGC GCAGCCTGAA TGGCGAATGG CGCCTGATGC 960 GGTATTTTCTCCTTACGCAT CTGTGCGGTA TTTCACACCG CATATGGTGC ACTCTCAGTA 1020 CAATCTGCTCTGATGCCGCA TAGTTAAGCC AGCCCCGACA CCCGCCAACA CCCGCTGACG 1080 CGCCCTGACGGGCTTGTCTG CTCCCGGCAT CCGCTTACAG ACAAGCTGTG ACCGTCTCCG 1140 GGAGCTGCATGTGTCAGAGG TTTTCACCGT CATCACCGAA ACGCGCGAGA CGAAAGGGCC 1200 TCGTGATACGCCTATTTTTA TAGGTTAATG TCATGATAAT AATGGTTTCT TAGACGTCAG 1260 GTGGCACTTTTCGGGGAAAT GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT 1320 CAAATATGTATCCGCTCATG AGACAATAAC CCTGATAAAT GCTTCAATAA TATTGAAAAA 1380 GGAAGAGTATGAGTATTCAA CATTTCCGTG TCGCCCTTAT TCCCTTTTTT GCGGCATTTT 1440 
GCCTTCCTGTTTTTGCTCAC CCAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT 1500 TGGGTGCACGAGTGGGTTAC ATCGAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT 1560 TTCGCCCCGAAGAACGTTTT CCAATGATGA GCACTTTTAA AGTTCTGCTA TGTGGCGCGG 1620 TATTATCCCGTATTGACGCC GGGCAAGAGC AACTCGGTCG CCGCATACAC TATTCTCAGA 1680 ATGACTTGGTTGAGTACTCA CCAGTCACAG AAAAGCATCT TACGGATGGC ATGACAGTAA 1740 GAGAATTATGCAGTGCTGCC ATAACCATGA GTGATAACAC TGCGGCCAAC TTACTTCTGA 1800 CAACGATCGGAGGACCGAAG GAGCTAACCG CTTTTTTGCA CAACATGGGG GATCATGTAA 1860 CTCGCCTTGATCGTTGGGAA CCGGAGCTGA ATGAAGCCAT ACCAAACGAC GAGCGTGACA 1920 CCACGATGCCTGTAGCAATG GCAACAACGT TGCGCAAACT ATTAACTGGC GAACTACTTA 1980 CTCTAGCTTCCCGGCAACAA TTAATAGACT GGATGGAGGC GGATAAAGTT GCAGGACCAC 2040 TTCTGCGCTCGGCCCTTCCG GCTGGCTGGT TTATTGCTGA TAAATCTGGA GCCGGTGAGC 2100 GTGGGTCTCGCGGTATCATT GCAGCACTGG GGCCAGATGG TAAGCCCTCC CGTATCGTAG 2160 TTATCTACACGACGGGGAGT CAGGCAACTA TGGATGAACG AAATAGACAG ATCGCTGAGA 2220 TAGGTGCCTCACTGATTAAG CATTGGTAAC TGTCAGACCA AGTTTACTCA TATATACTTT 2280 AGATTGATTTAAAACTTCAT TTTTAATTTA AAAGGATCTA GGTGAAGATC CTTTTTGATA 2340 ATCTCATGACCAAAATCCCT TAACGTGAGT TTTCGTTCCA CTGAGCGTCA GACCCCGTAG 2400 AAAAGATCAAAGGATCTTCT TGAGATCCTT TTTTTCTGCG CGTAATCTGC TGCTTGCAAA 2460 CAAAAAAACCACCGCTACCA GCGGTGGTTT GTTTGCCGGA TCAAGAGCTA CCAACTCTTT 2520 TTCCGAAGGTAACTGGCTTC AGCAGAGCGC AGATACCAAA TACTGTCCTT CTAGTGTAGC 2580 CGTAGTTAGGCCACCACTTC AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA 2640 TCCTGTTACCAGTGGCTGCT GCCAGTGGCG ATAAGTCGTG TCTTACCGGG TTGGACTCAA 2700 GACGATAGTTACCGGATAAG GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC 2760 CCAGCTTGGAGCGAACGACC TACACCGAAC TGAGATACCT ACAGCGTGAG CTATGAGAAA 2820 GCGCCACGCTTCCCGAAGGG AGAAAGGCGG ACAGGTATCC GGTAAGCGGC AGGGTCGGAA 2880 CAGGAGAGCGCACGAGGGAG CTTCCAGGGG GAAACGCCTG GTATCTTTAT AGTCCTGTCG 2940 GGTTTCGCCACCTCTGACTT GAGCGTCGAT TTTTGTGATG CTCGTCAGGG GGGCGGAGCC 3000 TATGGAAAAACGCCAGCAAC GCGGCCTTTT TACGGTTCCT GGCCTTTTGC TGGCCTTTTG 3060 CTCACATGTTCTTTCCTGCG TTATCCCCTG ATTCTGTGGA TAACCGTATT ACCGCCTTTG 3120 AGTGAGCTGATACCGCTCGC CGCAGCCGAA 
CGACCGAGCG CAGCGAGTCA GTGAGCGAGG 3180 AAGCGGAAGA 3190

30 base pairs nucleic acid single linear other nucleic acid /desc = “Oligonucleotide (NLC95)” 7
ATAATCCTTA AAAACTCCAT TTCCACCCCT 30

29 base pairs nucleic acid single linear other nucleic acid /desc = “Oligonucleotide (NLC209)” 8
GTGATTGCAC CGATCTTCTA CACCGTTCC 29

30 base pairs nucleic acid single linear other nucleic acid /desc = “Oligonucleotide (NLC429)” 9
TTTCACCGTC ATCACCGAAA CGCGCGAGAC 30

30 base pairs nucleic acid single linear other nucleic acid /desc = “Oligonucleotide (NLC430)” 10
AATGACTTGG TTGAGTACTC ACCAGTCACA 30

30 base pairs nucleic acid single linear other nucleic acid /desc = “Oligonucleotide (NLC431)” 11
ATGAACGAAA TAGACAGATC GCTGAGATAG 30

30 base pairs nucleic acid single linear other nucleic acid /desc = “Oligonucleotide (NLC432)” 12
CAAGACGATA GTTACCGGAT AAGGCGCAGC 30

30 base pairs nucleic acid single linear other nucleic acid /desc = “Oligonucleotide (NLC94)” 13
AAAGTCCAGT ATGCTTTTTC ACAGCATAAC 30

5 amino acids amino acid single linear peptide 14
Asn Tyr Asn Arg Asn 1 5

5 amino acids amino acid single linear peptide 15
Asn Tyr Thr Arg Asn 1 5

What is claimed is:
 1. An isolated ATP-utilizing regulatory protein encoded by a transposon, said protein containing a mutation that allows efficient and simple insertion of and reduced target site specificity on a transposable element derived from said transposon.
 2. The protein of claim 1 wherein said protein is TnsC.
 3. The protein of claim 2 wherein said mutation is valine at amino acid number 225.
 4. A transposon encoding an ATP-utilizing regulatory protein, the protein containing a mutation that allows efficient and simple insertion of and reduced target site specificity on said transposon.
 5. The transposon of claim 4, wherein said transposon is Tn7 or a derivative thereof.
 6. The transposon of claim 5 wherein said mutation is valine at amino acid number 225.
 7. A composition containing an isolated ATP-utilizing regulatory protein encoded by a transposon, said protein containing a mutation that allows efficient and simple insertion of and reduced target site specificity on said transposon and a transposable element derived from said transposon.
 8. The composition of claim 7 wherein said protein is TnsC.
 9. The composition of claim 8 wherein said mutation is valine at amino acid number 225.
 10. A composition containing a transposon encoding an ATP-utilizing regulatory protein, the protein containing a mutation that allows efficient and simple insertion of and reduced target site specificity on said transposon.
 11. The composition of claim 10 wherein said transposon is Tn7 or a derivative thereof.
 12. The composition of claim 11 wherein said mutation is valine at amino acid number 225.
 13. The composition according to claim 7 further containing a transposon or transposable element derived therefrom capable of being activated by said protein.
 14. The composition of claim 13 further comprising target DNA into which said transposon or derivative is capable of inserting.
 15. The composition of claim 13 in which said transposon or derivative contains at least one primer binding site that is native to said transposon or heterologous.
 16. The composition of claim 15 further comprising primers that hybridize to said primer binding site on the transposon or derivative.
 17. The composition of claim 13 in which said transposon or derivative contains a heterologous DNA sequence.
 18. The composition according to claim 10 further comprising a target sequence into which said transposon is capable of insertion.
 19. The composition according to claim 11 in which said transposon contains at least one primer binding site that is native to said transposon or heterologous.
 20. The composition of claim 19 further comprising primers that hybridize to the primer binding site on said transposon.
 21. The composition according to claim 11 in which said transposon contains a heterologous DNA sequence.
 22. A composition containing an isolated ATP-utilizing regulatory protein encoded by a transposon, said protein containing a mutation that allows efficient and simple insertion of and reduced target site specificity on a transposable element derived from said transposon, further comprising a transposon or transposable element derived therefrom capable of being activated by said protein.
 23. The composition of claim 22 wherein said transposon is Tn7 or a derivative thereof and said protein is TnsC.
 24. The composition of claim 23 wherein said mutation is valine at amino acid number 225.
 25. A kit containing the composition according to claim 22.